cl_ext_buffer_device_address #1159
through the execution information. | ||
|
||
Non-argument device pointers accessed by the kernel must be specified | ||
by passing pointers to those buffers via {clSetKernelExecInfo}. |
I think we could discuss whether we want to require applications to always pass in the full list of buffers, because that would free drivers from having to keep track of bda allocations. But that also depends on whether "this memory allocation is used in a dispatch" is relevant for most or some OpenCL implementations in the first place.
One can also argue that this extension should be convenient to use, so users don't have to keep track themselves. That makes the hot path (launching many kernels) more expensive, though, because at launch time at the latest the implementation needs to map back from pointer to buffer.
But I think it's also fine to keep it like this, because it's closer to how it's done for SVM.
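To make the "map back from pointer to buffer" cost concrete, here is a minimal sketch of what a driver-side lookup could look like. It is not taken from any implementation: `bda_range`, `bda_find`, and the sorted-array layout are hypothetical names and choices made only for illustration, assuming the driver keeps its device-address allocations sorted by base address and non-overlapping.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical driver-side record of one device-address allocation. */
typedef struct {
    uint64_t base; /* device address of the first byte */
    uint64_t size; /* allocation size in bytes */
    int      id;   /* stand-in for the driver's internal buffer handle */
} bda_range;

/* Binary search for the allocation that contains addr.
 * Assumes ranges is sorted by base and ranges do not overlap.
 * Returns the index of the containing range, or -1 if there is none. */
static int bda_find(const bda_range *ranges, size_t count, uint64_t addr)
{
    assert(ranges != NULL || count == 0);

    size_t lo = 0, hi = count;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (addr < ranges[mid].base) {
            hi = mid;                 /* addr lies below this range */
        } else if (addr - ranges[mid].base >= ranges[mid].size) {
            lo = mid + 1;             /* addr lies past the end of this range */
        } else {
            return (int)mid;          /* base <= addr < base + size */
        }
    }
    return -1;
}
```

With this layout every raw pointer costs one O(log n) search at launch time. If the application instead passes the buffers themselves, the lookup is not needed at all, which is the trade-off described above.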
Any motivation to get this merged? Or is there anything else that needs to be discussed before merging? I could also try to bring it up at the CL WG if needed.
Yep, this is still being discussed in the WG. I personally think it's useful as is and shouldn't harm anything if merged as it even has 2 implementations now. |
Thanks @SunSerega |
Alright, and now the problem I found in #1171 is visible here because the |
include::{generated}/api/version-notes/CL_MEM_DEVICE_ADDRESS_EXT.asciidoc[] | ||
| This flag specifies that the buffer must have a single fixed address | ||
for its lifetime and the address should be unique at least across the devices | ||
of the context, but not necessarily within the host (virtual) memory. |
I think support for this flag needs to be optional, as otherwise this extension will be difficult to implement on current Vulkan implementations.
With Vulkan you can use bda (optional in 1.2, mandatory in 1.3) to get a fixed address for a memory allocation, but a CL runtime wouldn't be able to guarantee that allocations get the same address across devices.
Or maybe it's better if it's optional only for multi-device contexts, so applications and libraries could use the same code on single-device contexts as well.
* {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources | ||
required by the OpenCL implementation on the host. | ||
|
||
Add a new flag to clSetKernelExecInfo for setting indirect device pointer access info <<clSetKernelExecInfo, List of supported param_name stable>>: |
Yes, this was the idea. I'll add a mention in the next update. |
| {cl_mem_device_address_pair_EXT_TYPE} | ||
| Returns the device-address pairs for all devices in the context. | ||
The per-device addresses might differ when the buffer was allocated | ||
with the CL_MEM_DEVICE_PRIVATE_EXT enabled. |
Updated according to @karolherbst comments. |
I've already implemented the extension fully in rusticl/mesa (including sharing the same address across devices) and I think it's fine. However, I'd still urge addressing the concerns I have about layered implementations that implement it on top of Vulkan. I already took those constraints into account when implementing it, but I think it's better to give clients a way to query whether address sharing across multiple devices is supported.
| {CL_MEM_DEVICE_PRIVATE_EXT_anchor} | ||
|
||
include::{generated}/api/version-notes/CL_MEM_DEVICE_PRIVATE_EXT.asciidoc[] | ||
| If this flag is combined with CL_MEM_DEVICE_ADDRESS_EXT, each device in |
I think it makes more sense to remove the "combined with CL_MEM_DEVICE_ADDRESS_EXT" part, then those two flags are independent.
Then a CL_DEVICE_SUPPORTS_MULTI_DEVICE_ADDRESS query could be added to clGetDeviceInfo to indicate if CL_MEM_DEVICE_ADDRESS_EXT is supported when the device is used in a context with other devices.
This should be enough to make this extension supportable on Vulkan as mentioned earlier.
The basic cl_mem buffer API doesn't enable access to the underlying raw pointers in the device memory, preventing its use in host side data structures that need pointer references to objects. This API adds a minimal increment on top of cl_mem that provides such capabilities. The version 0.1.0 is implemented in PoCL and rusticl for prototyping, but everything's still up for discussion. chipStar is the first client that uses the API.
Co-authored-by: Sun Serega <sunserega2@gmail.com>
Changed the CL_MEM_DEVICE_ADDRESS_EXT wording for multi-device cases to "all", not "any", covering the case where not all devices can ensure the same address across the context. In that case CL_INVALID_VALUE can be returned. Defined sub-buffer address computation to be 'base_addr + origin'. Added error conditions for clSetKernelExecInfo when the device doesn't support device pointers.
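As a rough illustration of the 'base_addr + origin' rule for sub-buffers, the following sketch computes a sub-buffer's device address from its parent's. The names `dev_addr_t` and `sub_buffer_address` are hypothetical and not part of the extension, and the bounds check is an assumption about what a caller would want, not wording from the specification.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a device address value. */
typedef uint64_t dev_addr_t;

/* Computes the device address of a sub-buffer as base_addr + origin.
 * Returns 0 and stores the address in *out when the region
 * [origin, origin + size) fits inside a parent of parent_size bytes;
 * returns -1 otherwise and leaves *out untouched. */
static int sub_buffer_address(dev_addr_t base_addr, size_t parent_size,
                              size_t origin, size_t size, dev_addr_t *out)
{
    assert(out != NULL);

    /* Written to avoid overflow in origin + size. */
    if (size == 0 || origin > parent_size || size > parent_size - origin)
        return -1;

    *out = base_addr + (dev_addr_t)origin;
    return 0;
}
```

For example, a parent buffer at device address 0x10000 with a sub-buffer at origin 256 yields 0x10100, so no separate address query is needed for the sub-buffer.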
...and renamed them to CL_MEM_DEVICE_SHARED_ADDRESS_EXT and CL_MEM_DEVICE_PRIVATE_ADDRESS_EXT. The first one guarantees the same address across all devices in the context, whereas the latter allows per-device addresses.