Can an implementation diverge on device info query and compiler feature macro for atomic_scope_all_devices #1129

Closed
bcalidas opened this issue Apr 1, 2024 · 5 comments · Fixed by #1171

@bcalidas

bcalidas commented Apr 1, 2024

This issue is related to #1047. We have a specific question.

Can an implementation omit CL_DEVICE_ATOMIC_SCOPE_ALL_DEVICES from the clGetDeviceInfo query CL_DEVICE_ATOMIC_MEMORY_CAPABILITIES but still define the compiler feature macro __opencl_c_atomic_scope_all_devices?

In this case the implementation does report __opencl_c_atomic_scope_all_devices for the clGetDeviceInfo query CL_DEVICE_OPENCL_C_FEATURES.

From the spec - "When used on a fine-grained non-atomic SVM buffer, a coarse-grained
SVM buffer, or a non-SVM buffer, operations parameterized with memory_scope_all_svm_devices
will behave as if they were parameterized with memory_scope_device"
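
The quoted rule can be read as a small decision function. The following is only a sketch of that reading: the enum and function names are hypothetical and are not part of the OpenCL headers or the specification.

```c
#include <assert.h>

/* Hypothetical names for this sketch; none of them come from CL/cl.h. */
typedef enum {
    BUF_NON_SVM,
    BUF_SVM_COARSE_GRAIN,
    BUF_SVM_FINE_GRAIN_NON_ATOMIC,
    BUF_SVM_FINE_GRAIN_ATOMIC
} buffer_kind;

typedef enum {
    SCOPE_WORK_ITEM,
    SCOPE_WORK_GROUP,
    SCOPE_DEVICE,
    SCOPE_ALL_SVM_DEVICES
} scope_kind;

/* Returns the scope an atomic operation effectively has under the quoted
 * rule: memory_scope_all_svm_devices falls back to memory_scope_device
 * unless the buffer is fine-grained SVM with atomics. */
scope_kind effective_scope(buffer_kind buf, scope_kind requested)
{
    assert(requested <= SCOPE_ALL_SVM_DEVICES);
    if (requested == SCOPE_ALL_SVM_DEVICES && buf != BUF_SVM_FINE_GRAIN_ATOMIC) {
        return SCOPE_DEVICE;
    }
    return requested;
}
```

Under this reading, a kernel that requests the all-devices scope on a coarse-grained SVM buffer still has defined behavior; it simply gets device scope.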

This would imply that it is ok for an implementation to report __opencl_c_atomic_scope_all_devices but not CL_DEVICE_ATOMIC_SCOPE_ALL_DEVICES since the behavior of kernels on such an implementation is defined in the spec.

There are some conformance tests which check for consistency between feature macro and device info queries. These will need to be adapted pending the outcome of this discussion. It does bring up the larger question of when it is appropriate to expect these queries to match.
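
For illustration, a strict consistency check of the kind described might look roughly like the sketch below. It is not the actual conformance test code. The bit value is assumed to mirror CL_DEVICE_ATOMIC_SCOPE_ALL_DEVICES from CL/cl.h, and the feature list is simplified to an array of strings rather than the cl_name_version entries that CL_DEVICE_OPENCL_C_FEATURES returns. The function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Assumed to mirror CL_DEVICE_ATOMIC_SCOPE_ALL_DEVICES in CL/cl.h. */
#define ATOMIC_SCOPE_ALL_DEVICES_BIT (1u << 6)

/* Hypothetical helper: true if `wanted` appears in the feature name list. */
bool has_feature(const char *const *names, size_t count, const char *wanted)
{
    assert(wanted != NULL);
    for (size_t i = 0; i < count; ++i) {
        if (strcmp(names[i], wanted) == 0) {
            return true;
        }
    }
    return false;
}

/* Hypothetical strict check: the capability bit and the feature macro
 * must be reported together or not at all. */
bool queries_match(unsigned long long atomic_caps,
                   const char *const *names, size_t count)
{
    bool cap_reported = (atomic_caps & ATOMIC_SCOPE_ALL_DEVICES_BIT) != 0;
    bool macro_reported =
        has_feature(names, count, "__opencl_c_atomic_scope_all_devices");
    return cap_reported == macro_reported;
}
```

An implementation that reports the feature macro without the capability bit would fail this strict form of the check, which is the divergence the question is about.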

An additional consideration is how the compiler options -cl-std=CL3.0 and -cl-std=CL2.0 should affect compiler behavior in this case.

@bashbaug
Contributor

bashbaug commented Apr 1, 2024

In the scenario above what are the SVM capabilities for this device? Specifically, does it support fine-grain SVM with atomics?

@bcalidas
Author

bcalidas commented Apr 1, 2024

The device supports coarse grain SVM only.

@bcalidas
Author

bcalidas commented Apr 9, 2024

We pass with the following configuration

For clGetDeviceInfo

  1. Report CL_DEVICE_SVM_COARSE_GRAIN_BUFFER under CL_DEVICE_SVM_CAPABILITIES
  2. Report CL_DEVICE_ATOMIC_SCOPE_ALL_DEVICES under CL_DEVICE_ATOMIC_MEMORY_CAPABILITIES
  3. Report __opencl_c_atomic_scope_all_devices under CL_DEVICE_OPENCL_C_FEATURES
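
With a configuration like this, an application cannot rely on the atomic capability bit alone to know whether the all-devices scope has any cross-device effect; it would also need to look at the SVM capabilities. A minimal sketch of that host-side reasoning follows. The bit values are assumed to mirror the CL_DEVICE_SVM_* definitions in CL/cl.h, and the function name is hypothetical.

```c
#include <stdbool.h>

/* Assumed to mirror the CL_DEVICE_SVM_* bit values in CL/cl.h. */
#define SVM_COARSE_GRAIN_BUFFER_BIT (1u << 0)
#define SVM_FINE_GRAIN_BUFFER_BIT   (1u << 1)
#define SVM_FINE_GRAIN_SYSTEM_BIT   (1u << 2)
#define SVM_ATOMICS_BIT             (1u << 3)

/* Hypothetical helper: true only if the device can create fine-grained SVM
 * allocations with atomics, the one case in which the all-devices scope
 * does not fall back to device scope. */
bool all_devices_scope_can_take_effect(unsigned long long svm_caps)
{
    bool fine_grain =
        (svm_caps & (SVM_FINE_GRAIN_BUFFER_BIT | SVM_FINE_GRAIN_SYSTEM_BIT)) != 0;
    bool atomics = (svm_caps & SVM_ATOMICS_BIT) != 0;
    return fine_grain && atomics;
}
```

For the coarse-grain-only device described here, the helper returns false even though the capability bit and the feature macro are both reported.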

With this combination, the meaning of __opencl_c_atomic_scope_all_devices is that the compiler supports the feature but that the behavior of the kernel could be different depending on the runtime feature support. If this is ok as per the intent of the spec, we will proceed with this solution. We could expand the underlying concept to #1047 as well.

It would be good to put up a spec PR against #1047 and confirm that the updated text reads well and aligns with existing implementations.

@bashbaug
Contributor

bashbaug commented Apr 9, 2024

We pass with the following configuration [...]

This matches what we report for our coarse-grain SVM GPUs also, see e.g. https://opencl.gpuinfo.org/displayreport.php?id=2215.

@lakshmih
Contributor

The text for __opencl_c_atomic_scope_all_devices already accounts for the fallback behavior when fine-grained SVM is not supported. We should update the descriptions of the runtime queries (CL_DEVICE_ATOMIC_MEMORY_CAPABILITIES and CL_DEVICE_ATOMIC_FENCE_CAPABILITIES) to specify the same behavior: when used on a fine-grained non-atomic SVM buffer, a coarse-grained SVM buffer, or a non-SVM buffer, CL_DEVICE_ATOMIC_SCOPE_ALL_DEVICES behaves like CL_DEVICE_ATOMIC_SCOPE_DEVICE.

lakshmih added a commit to lakshmih/OpenCL-Docs that referenced this issue May 21, 2024
…1129

Clarify behavior of above capability value on devices that
don't support fine grained SVM
bashbaug added a commit that referenced this issue Sep 12, 2024
* platform: Clarify behavior for ATOMIC_SCOPE_ALL_DEVICES

* Adjusted table widths to prevent overflow beyond a page

Asciidoctor has a limitation that prevents table cells from
spanning across pages

* Update api/opencl_platform_layer.asciidoc

Co-authored-by: Ben Ashbaugh <ben.ashbaugh@intel.com>

---------

Co-authored-by: Ben Ashbaugh <ben.ashbaugh@intel.com>

3 participants