Replies: 1 comment
For the most part the order won't matter, but if your distance matrix has a lot of ties among the short distances, then the order may make a difference.
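If you want to check this on your own data, here is a rough sketch using only NumPy. It counts, for each row, how many points are tied with the k-th nearest neighbour but would not fit into the k neighbour slots, and it shuffles the matrix so you can compare a re-run against the alphabetical one. The function names, the toy matrices, and the choice of `k` are made up for illustration; `k` only stands in for whatever neighbourhood size you use, and the count is an approximation of what a neighbour search would see, not UMAP's actual internals.

```python
import numpy as np


def count_boundary_ties(dist, k=15, tol=0.0):
    """Per row: number of points tied with the k-th nearest neighbour
    that would NOT fit into the k neighbour slots (0 means no ambiguity)."""
    dist = np.asarray(dist, dtype=float)
    n = dist.shape[0]
    excess = np.zeros(n, dtype=int)
    for i in range(n):
        row = np.delete(dist[i], i)            # drop the self-distance
        kth = np.partition(row, k - 1)[k - 1]  # k-th smallest distance
        n_closer = int(np.sum(row < kth - tol))
        n_tied = int(np.sum(np.abs(row - kth) <= tol))
        # Tied points compete for the slots left after the closer ones.
        excess[i] = n_tied - (k - n_closer)
    return excess


def permute_distance_matrix(dist, seed=0):
    """Apply the same random permutation to rows and columns."""
    dist = np.asarray(dist)
    rng = np.random.default_rng(seed)
    perm = rng.permutation(dist.shape[0])
    return dist[np.ix_(perm, perm)], perm


# Toy data (hypothetical): one matrix where every distance is tied,
# and one built from points on a line with no tied distances.
all_tied = np.ones((6, 6)) - np.eye(6)
positions = np.array([0.0, 1.0, 3.0, 7.0, 15.0, 31.0])
no_ties = np.abs(positions[:, None] - positions[None, :])

ties_in_tied = count_boundary_ties(all_tied, k=2)
ties_in_clean = count_boundary_ties(no_ties, k=2)
shuffled, perm = permute_distance_matrix(no_ties, seed=1)
```

If the counts are large for many rows of your real matrix, the order is more likely to matter. You could then embed both the original and the shuffled matrix with `umap.UMAP(metric="precomputed")`, map the shuffled result back through `perm`, and see whether the alphabetical clusters survive.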
Hey,
I used UMAP on a precomputed distance matrix and applied a clustering algorithm to the resulting embedding. The use case is that I have around 4000 different foods from a large number of recipes, and I computed the distance matrix from a recipe-food matrix. While I got many useful clusters (e.g. a vegetable cluster), I also got some clusters that seem rather nonsensical and merely alphabetical, such as milk, meat, mince, etc. I do not give the UMAP algorithm this information directly, but my distance matrix is sorted alphabetically: the first vector holds the distances from acerola to all other foods, and the last vector holds the distances from zucchini to all other foods. So I was wondering whether the UMAP algorithm might implicitly use the order of the input matrix, making it more likely that foods that are adjacent in the distance matrix end up close together in the embedding.
Thank you in advance :)