UMAP on large data set #967
Unanswered
jovanaarsic
asked this question in
Q&A
Replies: 0 comments
Dear all,
I need a little bit of help. 🤗
The task I have to solve is a clustering task on a large data set (about 6 million rows and 20 columns) with mixed data types.
The initial idea was to apply UMAP and then HDBSCAN.
As I understand it, the way to do this is to fit UMAP on a subset and then transform the rest of the data set with the fitted model.
However, I got an out-of-memory (OOM) error while fitting UMAP on 100K rows.
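For reference, the fit-on-a-subset, transform-the-rest idea looks roughly like the sketch below. It is only a minimal sketch under some assumptions: `X` is a NumPy array in which the mixed-type columns have already been encoded numerically, the helper names `fit_on_sample` and `transform_in_batches` are made up for illustration, and the sample size, batch size and `n_components` are placeholders rather than recommended values. It uses `umap.UMAP` from umap-learn with its `low_memory` option, and it is not guaranteed to avoid the OOM on its own.

```python
import numpy as np


def fit_on_sample(X, sample_size=20_000, n_components=5, seed=0, **umap_kwargs):
    """Fit UMAP on a random subset of rows and return the fitted model.

    Hypothetical helper; X is assumed to be an already numerically
    encoded 2-D array.
    """
    # Imported here so the batching helper below can be used without umap-learn.
    import umap

    rng = np.random.default_rng(seed)
    n_rows = X.shape[0]
    idx = rng.choice(n_rows, size=min(sample_size, n_rows), replace=False)
    # float32 halves the memory of the sample compared with float64.
    sample = np.asarray(X[idx], dtype=np.float32)
    model = umap.UMAP(n_components=n_components, low_memory=True, **umap_kwargs)
    model.fit(sample)
    return model


def transform_in_batches(model, X, batch_size=50_000):
    """Embed X chunk by chunk so only one batch is transformed at a time.

    Hypothetical helper; `model` is any fitted object with a
    `transform(array)` method, such as the model returned above.
    """
    parts = []
    for start in range(0, X.shape[0], batch_size):
        batch = np.asarray(X[start:start + batch_size], dtype=np.float32)
        parts.append(model.transform(batch))
    return np.vstack(parts)
```

Usage would be something like `model = fit_on_sample(X)` followed by `embedding = transform_in_batches(model, X)`, with the resulting embedding then passed to HDBSCAN. Batching the transform keeps the peak memory of that step bounded by the batch size, but the fit itself still has to hold the whole sample and its neighbour graph in memory.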
The set of questions related to UMAP would be:
Additionally,
Does anyone have experience with a similar problem that was solved with a different approach?
All advice, suggestions, and experiences are welcome. 🤗