The crash was indeed not intended - my mistake! Should be fixed now.
You've got the cluster semantics spot on, to be honest. Broad genres are grouped together, with a tendency for sub-genres to be grouped locally within those.
The overall shapes and the global structure don't carry any meaning; they're more a result of a particular UMAP run than something inherent in the data.
Would love to provide different views on it and go more in depth next, thanks for the suggestion.
Hey, thanks for reporting - this is fixed now. I messed up the static build and some browsers freaked out. By the law of showing things publicly, I had of course only tested in a browser that didn't. Hope you can give it another chance!
My apologies for that! First time deploying SvelteKit to Cloudflare Pages, and I messed up the static build. Should be fixed now, hope you can give it another shot.
The cluster memberships that come out of the first round are distributions over the different clusters, e.g. a given book is weighted 0.8 for cluster A and 0.2 for cluster B. The Hellinger distance is well-suited to quantify the difference between two distributions like that. Cosine similarity and Euclidean distance worked as well, but Hellinger gave subjectively nicer results.
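For illustration, here is a minimal sketch of the Hellinger distance between two membership vectors, assuming each one is a plain list of non-negative weights summing to 1. The function name and the example books are made up for this sketch and are not taken from the project's actual code.

```python
import math


def hellinger(p, q):
    """Hellinger distance between two discrete distributions.

    Returns 0.0 for identical distributions and 1.0 when they
    share no cluster at all (disjoint support).
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    # Sum of squared differences between the square roots of the weights.
    squared = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return math.sqrt(squared / 2)


# Hypothetical memberships over clusters A and B.
book_1 = [0.8, 0.2]
book_2 = [0.2, 0.8]
book_3 = [1.0, 0.0]

print(hellinger(book_1, book_2))  # mostly-A vs. mostly-B
print(hellinger(book_1, book_3))  # mostly-A vs. purely-A
```

Because it works on the square roots of the weights, the distance is bounded between 0 and 1 and is symmetric in its arguments.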
Very interesting question, I'm not sure! While developing, I noticed that the systems thinking books were spread over different genres, which I found quite pleasing. However, I'm not sure if other books were even more diffuse. I'll have to dig back in and find out :)