Don't cache head/tail index in Consumer/Producer #48
Thanks @zhenpingfeng! I've also seen slight improvements in the [...]. I'm not sure whether the results are reliable, and I don't know which of the benchmarks are most relevant in practice. But since the improvements are small and the regressions are big, I'm hesitant to merge this. Have you also tried the single-threaded benchmarks? What kind of CPU have you used, if I may ask? I'm using an Intel(R) Core(TM) i5-7Y54 CPU.
In the single-threaded test results, the default version does have a slight advantage in some tests.
Sorry for the late response. The results are quite mixed. Most differences are within ±5%. The following plots show a few bigger differences (blue: this PR).
Some benchmarks improved quite a lot:
[plots not preserved]
While other benchmarks regressed quite a lot:
[plots not preserved]
Please note that [...]
The two-threads benchmarks (which I think might be the most relevant here) show only very small differences within the noise threshold:
[plots not preserved]
In summary, the multi-threaded benchmarks show no difference, while the single-threaded ones show both big improvements and big regressions, which makes me think they are probably not very trustworthy. Unless there is a good explanation for the results, I'm hesitant to merge this, because it might actually be a net negative. Any theories that could explain the observations? Any further benchmark results?
IMHO, [...]
2573eb7 to b0fbd44
Yes, you are absolutely right @RamType0. Thanks for noticing this! I've changed those operations to use [...]. I also added comments in case somebody is wondering later why [...].
e94a9d6 to 5bf04a4
5bf04a4 to b6ad316
b6ad316 to 80c6842
I have decided to merge this. Some benchmarks are still inconclusive, but overall it tends to be an improvement. The two-thread benchmarks don't change.
I'm not sure whether this is an improvement or not.
The benchmarks are inconclusive on my machine.
The results might be different on a CPU with weak memory ordering, because the number of atomic loads increases.
Any comments?
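
For readers who want to see the trade-off concretely, here is a minimal sketch of what "caching" an index can mean in a single-producer single-consumer queue. It is not the code from this PR: `Shared`, `CachingProducer`, `ReloadingProducer`, `run` and the two `demo_*` functions are hypothetical names, the slots are `AtomicUsize` values only to keep the sketch free of `unsafe`, and it assumes the index in question is the one the producer itself owns. The caching variant keeps a private copy of its write index in a `Cell`, while the reloading variant reads the shared atomic on every push, which costs one additional atomic load per operation.

```rust
use std::cell::Cell;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

/// State shared by both ends of a toy SPSC queue (hypothetical, for illustration).
/// Indices only ever grow; overflow of `usize` is ignored in this sketch.
struct Shared {
    head: AtomicUsize,       // written only by the consumer
    tail: AtomicUsize,       // written only by the producer
    slots: Vec<AtomicUsize>, // atomics avoid `unsafe` in this sketch
}

impl Shared {
    fn new(capacity: usize) -> Arc<Shared> {
        Arc::new(Shared {
            head: AtomicUsize::new(0),
            tail: AtomicUsize::new(0),
            slots: (0..capacity).map(|_| AtomicUsize::new(0)).collect(),
        })
    }

    /// Producer side: write `value` at index `tail` unless the queue is full.
    fn push_at(&self, tail: usize, value: usize) -> bool {
        // Acquire pairs with the consumer's Release store to `head`.
        let head = self.head.load(Ordering::Acquire);
        if tail - head == self.slots.len() {
            return false; // full
        }
        self.slots[tail % self.slots.len()].store(value, Ordering::Relaxed);
        // Release publishes the slot write to the consumer.
        self.tail.store(tail + 1, Ordering::Release);
        true
    }

    /// Consumer side: take the oldest value, if any.
    fn pop(&self) -> Option<usize> {
        let head = self.head.load(Ordering::Relaxed);
        let tail = self.tail.load(Ordering::Acquire);
        if head == tail {
            return None; // empty
        }
        let value = self.slots[head % self.slots.len()].load(Ordering::Relaxed);
        self.head.store(head + 1, Ordering::Release);
        Some(value)
    }
}

/// Variant A: the producer keeps a private, non-atomic copy of its own index.
struct CachingProducer {
    shared: Arc<Shared>,
    tail: Cell<usize>,
}

impl CachingProducer {
    fn push(&self, value: usize) -> bool {
        let tail = self.tail.get(); // plain load, no atomic access
        let pushed = self.shared.push_at(tail, value);
        if pushed {
            self.tail.set(tail + 1);
        }
        pushed
    }
}

/// Variant B: the producer re-reads its own index from the shared atomic.
struct ReloadingProducer {
    shared: Arc<Shared>,
}

impl ReloadingProducer {
    fn push(&self, value: usize) -> bool {
        // Relaxed is enough here: only this thread ever writes `tail`,
        // and a thread always observes its own earlier stores.
        let tail = self.shared.tail.load(Ordering::Relaxed);
        self.shared.push_at(tail, value)
    }
}

/// Push 0..n from a second thread and collect everything on the calling thread.
fn run<P>(push: P, shared: Arc<Shared>, n: usize) -> Vec<usize>
where
    P: Fn(usize) -> bool + Send + 'static,
{
    let producer = thread::spawn(move || {
        for i in 0..n {
            while !push(i) {
                thread::yield_now(); // queue full, let the consumer run
            }
        }
    });
    let mut out = Vec::with_capacity(n);
    while out.len() < n {
        match shared.pop() {
            Some(v) => out.push(v),
            None => thread::yield_now(),
        }
    }
    producer.join().unwrap();
    out
}

pub fn demo_caching(n: usize) -> Vec<usize> {
    let shared = Shared::new(4);
    let p = CachingProducer { shared: Arc::clone(&shared), tail: Cell::new(0) };
    run(move |v| p.push(v), shared, n)
}

pub fn demo_reloading(n: usize) -> Vec<usize> {
    let shared = Shared::new(4);
    let p = ReloadingProducer { shared: Arc::clone(&shared) };
    run(move |v| p.push(v), shared, n)
}
```

Both variants deliver the same items in the same order, so calling `demo_caching(1000)` and `demo_reloading(1000)` from a `main` function should yield identical vectors. The difference is only in how many atomic loads each push performs, which is presumably why the effect is hard to see in benchmarks and may depend on the CPU's memory model.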