[HIP] Implement workaround for hipMemset2D #1395

Merged: 1 commit into oneapi-src:main on Apr 12, 2024

Conversation

@konradkusiak97 (Contributor) commented on Feb 28, 2024:

There is an issue with hipMemset2D in ROCm versions prior to 6.0.0, and this PR adds a workaround for it in commonMemSetLargePattern.

The issue appears only when using a pointer to host pinned memory obtained from hipHostMalloc. I believe this case was not exercised until the USM fill refactor (intel/llvm#12702).

Testing in intel/llvm: intel/llvm#12898
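For illustration only, here is a minimal sketch of the kind of row-by-row fallback such a workaround can take. This is not the code from this PR; the helper name and the compile-time version check are assumptions, and the real commonMemSetLargePattern also has to handle multi-byte patterns, which this sketch glosses over.

```cpp
// Hypothetical sketch (not the PR's code): fall back to one 1D memset per row
// when hipMemset2D is known to misbehave with hipHostMalloc'd pinned memory
// on ROCm releases before 6.0.0. Names are illustrative only.
#include <hip/hip_runtime.h>

static hipError_t memset2DWithWorkaround(void *Ptr, size_t Pitch, int Value,
                                          size_t Width, size_t Height,
                                          hipStream_t Stream) {
#if HIP_VERSION_MAJOR < 6
  // Workaround path: issue Height independent 1D async memsets, one per row.
  for (size_t Row = 0; Row < Height; ++Row) {
    void *RowPtr = static_cast<char *>(Ptr) + Row * Pitch;
    if (hipError_t Err = hipMemsetAsync(RowPtr, Value, Width, Stream);
        Err != hipSuccess)
      return Err;
  }
  return hipSuccess;
#else
  // Newer ROCm handles pinned host memory correctly, so use the 2D call.
  return hipMemset2DAsync(Ptr, Pitch, Value, Width, Height, Stream);
#endif
}
```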

@codecov-commenter commented on Feb 28, 2024:

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 12.51%. Comparing base (78ef1ca) to head (19df225).
Report is 96 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1395      +/-   ##
==========================================
- Coverage   14.82%   12.51%   -2.32%     
==========================================
  Files         250      239      -11     
  Lines       36220    35949     -271     
  Branches     4094     4076      -18     
==========================================
- Hits         5369     4498     -871     
- Misses      30800    31447     +647     
+ Partials       51        4      -47     


@GeorgeWeb (Contributor) left a comment:

Generally looks good to me. Would just prefer the lambda be a free function.

4 review threads on source/adapters/hip/enqueue.cpp (outdated, resolved)
@MartinWehking (Contributor) left a comment:
LGTM!

@konradkusiak97 added the ready to merge label on Mar 5, 2024
@kbenzie added the hip (HIP adapter specific issues) label on Apr 3, 2024
@kbenzie kbenzie merged commit 1473ed8 into oneapi-src:main Apr 12, 2024
51 checks passed
sarnex pushed a commit to intel/llvm that referenced this pull request on Apr 12, 2024
sarnex pushed a commit to intel/llvm that referenced this pull request on May 2, 2024:
This PR changes the `queue.fill()` implementation to make use of the
native functions for a specific backend. It also unifies that
implementation with the one for memset, since memset is just the 8-bit
special case of fill.

In the CUDA case, both memset and fill currently call
`urEnqueueUSMFill`, which, depending on the size of the fill pattern,
calls either `cuMemsetD8Async`, `cuMemsetD16Async`, `cuMemsetD32Async`,
or `commonMemSetLargePattern`. Before this patch, memset went through
the same path but always set patternSize to 1 byte, which resulted in
calling `cuMemsetD8Async`. The behaviour in other backends is analogous.

The fill method previously just invoked a `parallel_for` to fill the
memory with the pattern, which made the operation quite slow.
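As a rough illustration of the pattern-size dispatch described above, the sketch below shows how a USM fill might select between the CUDA driver memset entry points. `usmFill` and the forward-declared `fillLargePattern` helper are illustrative stand-ins for the adapter's actual symbols, not its real code.

```cpp
// Hypothetical sketch of dispatching on pattern size: 1/2/4-byte patterns map
// onto the native cuMemsetD{8,16,32}Async calls, larger patterns fall through
// to a dedicated large-pattern helper (assumed to exist, mirroring
// commonMemSetLargePattern).
#include <cuda.h>
#include <cstring>

CUresult fillLargePattern(CUdeviceptr Dst, const void *Pattern,
                          size_t PatternSize, size_t Size, CUstream Stream);

CUresult usmFill(CUdeviceptr Dst, const void *Pattern, size_t PatternSize,
                 size_t Size, CUstream Stream) {
  switch (PatternSize) {
  case 1: {
    unsigned char Value;
    std::memcpy(&Value, Pattern, 1);
    return cuMemsetD8Async(Dst, Value, Size, Stream);
  }
  case 2: {
    unsigned short Value;
    std::memcpy(&Value, Pattern, 2);
    return cuMemsetD16Async(Dst, Value, Size / 2, Stream);
  }
  case 4: {
    unsigned int Value;
    std::memcpy(&Value, Pattern, 4);
    return cuMemsetD32Async(Dst, Value, Size / 4, Stream);
  }
  default:
    // Patterns wider than 4 bytes have no single native memset call.
    return fillLargePattern(Dst, Pattern, PatternSize, Size, Stream);
  }
}
```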

This PR depends on:
- oneapi-src/unified-runtime#1395
- oneapi-src/unified-runtime#1412
Labels: hip (HIP adapter specific issues), ready to merge (Added to PRs which are ready to merge)
5 participants