[SYCL][COMPAT] Memset API updated to support 2-byte and 4-byte memsets #11340
Alcpz commented on Sep 28, 2023
- The memory interface has changed:
  - `fill` has been removed
  - `memset` now offers a templated version (which works as `fill` did)
  - `memset_d8`, `memset_d16`, and `memset_d32` were added to allow memsetting 1-, 2-, and 4-byte-sized values.
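To make the distinction concrete: a classic byte-wise memset can only repeat a single byte, so a 2-byte or 4-byte pattern needs a typed fill. The following is a minimal host-side sketch of that idea in plain C++. It is not the SYCLcompat implementation; `fill_n_values` is a hypothetical helper, and the wrapper signatures below are assumptions that only borrow the names from the description above.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical host-side stand-in for a templated memset: writes `count`
// copies of `value` (sizeof(T) bytes each) starting at `dst`.
template <typename T>
void fill_n_values(void *dst, T value, std::size_t count) {
  auto *bytes = static_cast<unsigned char *>(dst);
  for (std::size_t i = 0; i < count; ++i) {
    // memcpy avoids making alignment or aliasing assumptions about dst.
    std::memcpy(bytes + i * sizeof(T), &value, sizeof(T));
  }
}

// Thin wrappers named after the entry points in the PR description.
// The parameter lists are assumed for illustration only.
inline void memset_d8(void *dst, unsigned char value, std::size_t count) {
  fill_n_values<std::uint8_t>(dst, value, count);
}

inline void memset_d16(void *dst, unsigned short value, std::size_t count) {
  fill_n_values<std::uint16_t>(dst, static_cast<std::uint16_t>(value), count);
}

inline void memset_d32(void *dst, unsigned int value, std::size_t count) {
  fill_n_values<std::uint32_t>(dst, static_cast<std::uint32_t>(value), count);
}
```

In this sketch `count` is a number of elements, not bytes, so `memset_d16(ptr, 0xBEEF, 3)` writes six bytes. Whether the real API counts elements or bytes should be checked against the SYCLcompat header.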
Working through this PR, I see that we have previously been somewhat inconsistent about |
@konradkusiak97 is this a duplication of your fill work? Maybe there should just be a single interface for this functionality?
After #12702 is merged, the fill command will use backend-specific calls like |
Yeah, that is what I was thinking. That would mean that the SYCL spec already supports this functionality in theory and you have now implemented it in practice. Therefore this compat PR doesn't seem to be a good idea? Why not just translate CUDA code to the intended part of the SYCL specification?
In this case, SYCLcompat wraps the fill functionality @konradkusiak97 has been working on (in the specific case of the CUDA backend). While I agree that we should translate that way directly, this serves as an example of how to translate, and it is also needed for applications that have already been translated.
Closed. Replaced by #13409
…#13409)

This PR replaces #11340.

This PR extends the memory header to include 2-byte and 4-byte memsets.
- `memset` remains unchanged.
- 2D / 3D memsets are templated and wrap `sycl::fill`. Functionality remains unchanged as it is exposed through `detail::memset<unsigned char>`, equivalent to what we had before.
- `memset_d16` and `memset_d32` calls are added, wrapped around `sycl::fill` using 2-byte and 4-byte datatypes.

Added tests for `memset_d16` and `memset_d32`.