Does order play a role if the metric is precomputed #720

larsmoe · 2021-07-13T10:33:01Z

Hey,

I used UMAP on a precomputed distance matrix and applied a clustering algorithm to the resulting embedding. The use case: I have around 4000 different foods drawn from a large number of recipes, and I computed the distance matrix from a recipe-food matrix. Many of the resulting clusters are useful (e.g. a vegetable cluster), but some seem to be nonsense and purely alphabetical, such as milk, meat, mince, etc.

I do not give UMAP this information directly, but my distance matrix is sorted alphabetically: the first row holds the distances from acerola to all other foods, and the last row holds the distances from zucchini to all other foods. So I was wondering whether UMAP might implicitly use the order of the input matrix, making foods that are adjacent in the distance matrix more likely to end up close together in the embedding.

Thank you in advance :)
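
One way to test this empirically is to shuffle the distance matrix with a single random permutation applied to both rows and columns, then compare the results against the alphabetical version. The following is only a minimal sketch: the toy `points` array stands in for the real food data, and the helper name `permute_distance_matrix` is made up for illustration.

```python
import numpy as np


def permute_distance_matrix(dist, rng):
    """Shuffle a square distance matrix with one permutation for rows and columns."""
    perm = rng.permutation(dist.shape[0])
    # np.ix_ applies the same permutation to both axes, so entry (i, j)
    # of the result equals dist[perm[i], perm[j]].
    return dist[np.ix_(perm, perm)], perm


rng = np.random.default_rng(0)

# Hypothetical toy data standing in for the real food distance matrix.
points = rng.random((6, 3))
dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)

shuffled, perm = permute_distance_matrix(dist, rng)

# The inverse permutation maps anything computed on the shuffled matrix
# (rows, embedding coordinates, cluster labels) back to the original order.
inverse = np.argsort(perm)
restored = shuffled[np.ix_(inverse, inverse)]
```

Both `dist` and `shuffled` could then be embedded with `umap.UMAP(metric="precomputed", random_state=...)` and clustered the same way. Indexing the labels from the shuffled run with `inverse` puts them back in alphabetical order for comparison. Since UMAP is stochastic, some variation between runs is expected regardless of order, so only a consistent disappearance of the alphabetical clusters after shuffling would point to an order effect.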

lmcinnes · 2021-07-13T13:09:11Z

Maintainer

Mostly order won't matter, but if you have a lot of ties in short distances then order may make a difference.
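
To see whether ties are likely to matter for a given matrix, one can check how often the k-th and (k+1)-th nearest distances in a row are equal, since in those rows the choice of neighbors depends on how ties are broken. The sketch below uses plain NumPy on a hand-made matrix. It is not UMAP's actual neighbor search, and `tie_report` is a hypothetical helper name; the stable sort is only meant to illustrate how index order can decide between tied neighbors.

```python
import numpy as np


def tie_report(dist, k):
    """Flag rows whose k-th and (k+1)-th nearest distances are tied."""
    d = np.asarray(dist, dtype=float).copy()
    np.fill_diagonal(d, np.inf)  # ignore each point's distance to itself
    sorted_d = np.sort(d, axis=1)
    # A tie at the cutoff means the k-neighbor set is not uniquely defined.
    return np.isclose(sorted_d[:, k - 1], sorted_d[:, k])


# Hand-made example: item 0 is equally far from every other item.
dist = np.array([
    [0.0, 1.0, 1.0, 1.0],
    [1.0, 0.0, 0.5, 2.0],
    [1.0, 0.5, 0.0, 2.0],
    [1.0, 2.0, 2.0, 0.0],
])

k = 2
tied_rows = tie_report(dist, k)

# With a stable sort, tied distances are resolved by index order, so the
# earliest items in the matrix win. This is an illustration of index-based
# tie-breaking, not a claim about UMAP internals.
d = dist.copy()
np.fill_diagonal(d, np.inf)
neighbors = np.argsort(d, axis=1, kind="stable")[:, :k]
```

Here rows 0 and 3 are flagged, and item 0 gets items 1 and 2 as neighbors purely because they come first. If a large fraction of rows in the real matrix is flagged for the `n_neighbors` value in use, shuffling the matrix or adding a tiny amount of noise to the distances are two ways to check whether the alphabetical clusters are an artifact of tie-breaking.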
