Implement MCS Combining lock #666
Conversation
d1f5f53 to ffa8fcb
I attempted to integrate
This is a hybrid of Flat Combining and the MCS queue lock. It uses the queue like the MCS queue lock, but each item additionally contains a thunk that performs the body of the critical section. This enables threads other than the one that initially issued the request to perform the work.
This update adds a fast-path flag for the uncontended case, which reduces the number of atomic operations when the lock is uncontended.
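The combining idea described above can be sketched in a few lines. This is a simplified illustration under assumed names (`CombiningLock`, `Node`, `with` are hypothetical here), not snmalloc's actual implementation: each on-stack node carries a thunk, and whichever thread ends up holding the lock drains the queue, executing other threads' work for them.

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <utility>

// Sketch only: an MCS-style queue whose nodes also carry a thunk, so the
// current lock holder can execute queued critical sections on behalf of
// the threads that enqueued them.
struct CombiningLock
{
  struct Node
  {
    std::atomic<Node*> next{nullptr};
    std::atomic<bool> done{false};
    std::function<void()> work; // body of the critical section
  };

  std::atomic<Node*> tail{nullptr};

  template<typename F>
  void with(F&& f)
  {
    Node me;
    me.work = std::forward<F>(f);

    // Enqueue at the tail, as in a plain MCS lock.
    Node* prev = tail.exchange(&me, std::memory_order_acq_rel);
    if (prev != nullptr)
    {
      prev->next.store(&me, std::memory_order_release);
      // Wait until our work has been performed, possibly by another thread.
      while (!me.done.load(std::memory_order_acquire))
        ;
      return;
    }

    // We hold the lock: run our own work, then combine by running the work
    // of any nodes queued behind us.
    Node* curr = &me;
    while (true)
    {
      curr->work();
      Node* next = curr->next.load(std::memory_order_acquire);
      if (next == nullptr)
      {
        // Try to close the queue; if no one raced in, we are finished.
        Node* expected = curr;
        if (tail.compare_exchange_strong(
              expected, nullptr,
              std::memory_order_acq_rel, std::memory_order_relaxed))
        {
          curr->done.store(true, std::memory_order_release);
          return;
        }
        // A new node was enqueued; wait for its link to appear.
        while ((next = curr->next.load(std::memory_order_acquire)) == nullptr)
          ;
      }
      // Release the waiter; its stack node must not be touched after this.
      curr->done.store(true, std::memory_order_release);
      curr = next;
    }
  }
};
```

Note the ordering constraint this makes visible: once `done` is set, the waiter may return and destroy its stack node, so `next` has to be read before `done` is stored.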
src/snmalloc/ds/combininglock.h (Outdated)
        // queue.
        auto curr_c = curr;
        if (lock.head.compare_exchange_strong(
              curr_c, nullptr, std::memory_order_acq_rel))
The failure order should be Relaxed
Good catch, thanks
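The review point is about the two-memory-order overload of `compare_exchange_strong`: the single-order form above derives the failure order from the success order (here acquire), but when the failed load's result is only used to retry, relaxed suffices. A standalone illustration, with hypothetical names (`head`, `try_close` are not snmalloc's):

```cpp
#include <atomic>
#include <cassert>

// Illustration of the review comment: specify a weaker failure order
// explicitly instead of letting it be derived from the success order.
std::atomic<int*> head{nullptr};

bool try_close(int* curr)
{
  int* expected = curr;
  // Success publishes the new queue state, so it needs acq_rel; on failure
  // we only learn that the CAS lost, so a relaxed load is enough.
  return head.compare_exchange_strong(
    expected, nullptr,
    std::memory_order_acq_rel,
    std::memory_order_relaxed);
}
```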
    void attach(CombiningLock& lock)
    {
      // Test if no one is waiting
      if (lock.head.load(std::memory_order_relaxed) == nullptr)
Maybe it would be beneficial to separate this region out as a fast path, subject to inlining. The remaining parts can be put into an attach_slow function.
I don't think this code is used enough here to warrant that, but maybe I should for future use.
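For reference, the suggested split usually looks like the following. This is a sketch of the shape of the suggestion, not the PR's code; `Lock`, `try_attach_fast`, and `attach_slow` are hypothetical names, and the contended path is stubbed out:

```cpp
#include <atomic>
#include <cassert>

// Sketch of a fast-path / slow-path split: keep the uncontended check in a
// small inlinable function, and move the queueing logic out of line so the
// common case stays cheap at every call site.
struct Lock
{
  std::atomic<void*> head{nullptr};
};

inline bool try_attach_fast(Lock& lock, void* node)
{
  // Uncontended: no one is waiting, try to take the lock directly.
  void* expected = nullptr;
  return lock.head.compare_exchange_strong(
    expected, node, std::memory_order_acq_rel, std::memory_order_relaxed);
}

void attach_slow(Lock& lock, void* node)
{
  // Contended path: a real combining lock would enqueue `node` MCS-style
  // here; this stub just retries the fast path.
  while (!try_attach_fast(lock, node))
    ;
}

inline void attach(Lock& lock, void* node)
{
  if (try_attach_fast(lock, node))
    return;
  attach_slow(lock, node);
}
```

The benefit is code size at call sites: `attach` inlines to a load, a CAS, and a rarely taken call.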
One thought for the future, and one bit of humor, but this LGTM. Took me a moment to understand again how the MCS lock's queue worked; one of these days it'll stick.
@@ -133,4 +133,11 @@ namespace snmalloc
      lock.flag.store(false, std::memory_order_release);
    }
  };

  template <typename F>
  inline void with(FlagWord& lock, F&& f)
Not when? ;)
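The `with` helper in the hunk above follows a common pattern: run a lambda while holding a flag lock. A self-contained sketch, where `FlagWord` is a stand-in spin flag rather than snmalloc's actual definition:

```cpp
#include <atomic>
#include <cassert>
#include <utility>

// Stand-in for snmalloc's FlagWord: a simple test-and-set spin flag.
struct FlagWord
{
  std::atomic<bool> flag{false};
};

// Run f while holding the flag lock, releasing it afterwards.
template<typename F>
inline void with(FlagWord& lock, F&& f)
{
  while (lock.flag.exchange(true, std::memory_order_acquire))
    ; // spin until we acquire the flag
  std::forward<F>(f)();
  lock.flag.store(false, std::memory_order_release);
}
```

Scoping the critical section to a lambda makes it hard to forget the release, at the cost of not supporting early returns out of the caller from inside the body.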
   * might sort a collection of inserts, and perform them in a single traversal.
   *
   * Note that, we should perhaps add a Futex/WakeOnAddress mode to improve
   * performance in the contended case, rather than spinning.
Could perhaps templatize the class on <bool notify = false> or such?
Another interesting synchronization technique using on-stack nodes is the HemLock from Oracle: https://arxiv.org/pdf/2102.03863
Thanks for the link. I hadn't seen that algorithm.