Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[umf] optimize bucket selection for allocation sizes #994

Closed
wants to merge 1 commit into from

Conversation

DamianDuy
Copy link

@DamianDuy DamianDuy commented Oct 25, 2023

This PR changes linear search for buckets to calculating the bucket index based on calculating the power of two (or halfway power of two) larger than Size parameter by using bitwise operations.

Solves: #871

@pbalcer
Copy link
Contributor

pbalcer commented Oct 25, 2023

This affects SYCL, so please follow the process outlined in the contributing guide.

Copy link
Contributor

@pbalcer pbalcer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hm, can you please do a micro-benchmark to see which implementation is better? My initial thinking was something like 0f296e7 - that's the common approach to this problem.

@@ -902,6 +904,10 @@ umf_result_t DisjointPool::initialize(umf_memory_provider_handle_t *providers,
!((parameters.MinBucketSize & (parameters.MinBucketSize - 1)) == 0)) {
return UMF_RESULT_ERROR_INVALID_ARGUMENT;
}
// MinBucketSize parameter can't be larger than CutOff.
if (parameters.MinBucketSize > CutOff) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is CutOff exposed somehow to the user? If not, how will user know what is the max size he can set?
If this is not exposed, I think you should just set MinBucketSize to CutOff if it's bigger.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@igchor
Copy link
Member

igchor commented Oct 25, 2023

hm, can you please do a micro-benchmark to see which implementation is better? My initial thinking was something like 0f296e7 - that's the common approach to this problem.

You could also try moving the size outside of the bucket class (create a second array/vector of the same size as the current Buckets that would only keep size) and do a binary search on that. This way you would have much fewer cache misses.

@DamianDuy DamianDuy force-pushed the optimizeFindBucket branch 2 times, most recently from a9b326f to 5ab7dc0 Compare November 15, 2023 12:46
@codecov-commenter
Copy link

codecov-commenter commented Nov 15, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (5c0557e) 16.19% compared to head (7602d17) 16.26%.
Report is 4 commits behind head on main.

❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #994      +/-   ##
==========================================
+ Coverage   16.19%   16.26%   +0.07%     
==========================================
  Files         220      221       +1     
  Lines       30792    30818      +26     
  Branches     3481     3487       +6     
==========================================
+ Hits         4986     5012      +26     
  Misses      25755    25755              
  Partials       51       51              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@DamianDuy DamianDuy changed the base branch from main to adapters November 15, 2023 13:03
@DamianDuy DamianDuy requested review from a team as code owners November 15, 2023 13:03
@DamianDuy DamianDuy changed the base branch from adapters to main November 15, 2023 13:05
@DamianDuy DamianDuy force-pushed the optimizeFindBucket branch 2 times, most recently from c8b6be2 to e0c887a Compare November 15, 2023 13:40
@DamianDuy
Copy link
Author

After microbenchmarking I changed the implementation from binary search (which was much slower) to the approach with calculating index using bitwise operations.


assert((It != Buckets.end()) && "Bucket should always exist");
if (Size == lower_bound) {
Copy link
Member

@igchor igchor Nov 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The compiler might already do this but you could avoid the branches entirely by implementing it like this:

auto isPowerOf2 = 0 == (Size & (Size - 1));
auto largerThanHalfwayBetweenPowersOf2 = !isPowerOf2 && bool((Size - 1) && (1 << (position - 1)));
auto index = (position - MinBucketSizeExp) * 2 + !isPowerOf2 + largerThanHalfwayBetweenPowersOf2;
return index;

(if you decide to use this code, you should check if it's actually correct ;d)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, it doesn't work correctly with halfway values and personally I find if branches more readable. Also wouldn't this code branch anyway because of a usage of the && operator?

@DamianDuy
Copy link
Author

I am closing this PR because there is a corresponding one in the unified-memory-framework repositorium:
oneapi-src/unified-memory-framework#37

@DamianDuy DamianDuy closed this Nov 16, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants