-
-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
EventLoop: store Timers in min Pairing Heap #15206
base: master
Are you sure you want to change the base?
EventLoop: store Timers in min Pairing Heap #15206
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I won't comment on how it is used, but I have spent a bit of time polishing a paired heap, so some comments on that..
src/crystal/pointer_pairing_heap.cr
Outdated
end | ||
|
||
# Twopass merge of the children of *node* into pairs of two. | ||
private def merge_pairs(a : T*) : T* | Nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How big heaps are allowed? This implementation will create a runtime stack n
elements deep where n
is the amount of elements added since the latest shift?
or delete
. That is possibly not an issue if it is hundreds of elements, but with many thousands of elements queued at the same time may become a problem. Compare https://github.com/yxhuvud/pairing_heap/blob/master/src/pairing_heap.cr#L95 which also is a bit more explicit in the code about what the two faces are (the two while loops).
(fun fact: This is where almost all the time of the algorithm is spent - the algorithm has decent throughput, implementation complexity and time complexities, but it do suffer a bit from latency spikes, which can happen if a lot is enqueued at once)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed, this is the current issue. With one million timers enqueued at once, no delete and thus no rebalance, it fills the 8KB of my thread main stack.
Thanks for the help, I'll look into it. It's much simpler!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done! Thank you @yxhuvud 🙇
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, thank you. Only thing I'm not happy about here is that it won't be in a released version of crystal before advent of code starts, meaning I still will have to copy-paste my copy of it into my solutions :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is where almost all the time of the algorithm is spent
I'm wondering if the auxiliary insertion buffer wouldn't help. Of course not by itself, since it would behave exactly the same (twopass + merge on delete/min), but what if the buffer has a maximum size, so it doesn't grow too big? 🤔
Simpler PointerPairingHeap: let the Timers object deal with knowing whether we added to HEAD or actually deleted a timer, etc. Fix potential stack-overflow when merging pairs after inserting _lots_ of timers (over 1 million fills a 8KB thread stack).
586496e
to
4a63e07
Compare
Replaces the
Deque
used in #14996 for a min Pairing Heap which is a kind of Mergeable Heap and is one of the best performing heap in practical tests when arbitrary deletions are required (think cancelling a timeout), otherwise a D-ary Heap (e.g. 4-heap) will usually perform better. See the A Nearly-Tight Analysis of Multipass Pairing Heaps paper or the Wikipedia page for more details.The implementation itself is based on the Pairing Heaps: Experiments and Analysis paper, and merely implements a recursive twopass algorithm (the auxiliary twopass might perform even better). The
Crystal::PointerPairingList(T)
type is generic and relies on intrusive nodes (the links are intoT
) to avoid extra allocations for the nodes (same asCrystal::PointerLinkedList(T)
). It also requires aT#heap_compare
method, so we can use the same type for a min or max heap, or to build a more complex comparison.Note: I also tried a 4-heap, and while it performs very well and only needs a flat array, the arbitrary deletion (e.g. cancelling timeout) needs a linear scan and its performance quickly plummets, even at low occupancy, and becomes painfully slow at higher occupancy (tens of microseconds on each delete, while the pairing heap does it in tens of nanoseconds).
Follow up to #14996
Builds on top of #15205, you should open the last commit for just the pairing heap.