Optimize clump
#656
Comments
When checking whether a collection of global clump IDs overlaps with the collections used in neighbouring partitions, we can stop comparing as soon as there is no overlap. We don't have to compare each collection with every other collection: more distant collections are less likely to contain clumps that should be merged with clumps in the current partition. The strategy is to reduce the number of times sets need to be compared.
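A minimal sketch of the early-exit idea, not the project's actual code. It assumes the partitions lie in a 1-D row, so that list distance equals spatial distance, and that each partition carries a Python set of global clump IDs. The name `overlapping_pairs` and the sample data are hypothetical.

```python
def overlapping_pairs(id_sets):
    """Find partition pairs that share global clump IDs.

    id_sets[i] is the set of global clump IDs used in partition i.
    Partitions are assumed to be ordered in a 1-D row (an assumption
    made for this sketch only).
    """
    pairs = []
    nr_comparisons = 0

    for i in range(len(id_sets)):
        # Visit the nearest partition first, then more distant ones
        for j in range(i + 1, len(id_sets)):
            nr_comparisons += 1

            if id_sets[i].isdisjoint(id_sets[j]):
                # No shared IDs: assume partitions further away
                # share none either and stop comparing
                break

            pairs.append((i, j))

    return pairs, nr_comparisons


id_sets = [{1, 2}, {2, 3}, {2, 4}, {5}, {5, 6}, {7}]
pairs, nr_comparisons = overlapping_pairs(id_sets)
print(pairs, nr_comparisons)
```

For these six sets the loop performs 9 comparisons, where comparing every pair would take 15. The early exit is only safe if a clump cannot skip over a partition, which holds for contiguous clumps in the assumed 1-D layout. A 2-D layout would need its own notion of distance.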
Clump contains a serial step that stitches together the local clumps, which are determined in parallel. Part of this serial step is the most expensive part of the whole algorithm, and it prevents good performance and scalability. Revisit the code and try to make this step less expensive.
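To illustrate what such a stitching step does, here is a sketch using a union-find (disjoint-set) structure. This is a common technique for merging labels, and the issue does not say that the project uses it. The function `stitch`, the integer clump IDs, and the list of touching pairs are all hypothetical.

```python
def stitch(nr_ids, touching_pairs):
    """Merge clump IDs that touch across partition borders.

    nr_ids: number of clump IDs, assumed to be numbered 0..nr_ids-1.
    touching_pairs: (a, b) pairs of IDs found to be adjacent across
    a partition border (hypothetical input for this sketch).
    Returns a list mapping each ID to its merged representative ID.
    """
    parent = list(range(nr_ids))

    def find(x):
        # Follow parent links to the root, shortening the path on the way
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in touching_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # Keep the smallest ID as the representative of the merged clump
            parent[max(root_a, root_b)] = min(root_a, root_b)

    return [find(x) for x in range(nr_ids)]


merged = stitch(6, [(0, 3), (3, 5), (1, 4)])
print(merged)
```

In this sketch the merge loop is the serial part. Its cost grows with the number of touching pairs, so reducing the number of candidate pairs makes it cheaper.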