Can an implementation diverge on device info query and compiler feature macro for atomic_scope_all_devices #1129
Comments
In the scenario above, what are the SVM capabilities for this device? Specifically, does it support fine-grain SVM with atomics?
The device supports coarse-grain SVM only.
We pass with the following configuration For clGetDeviceInfo
With this combination, the meaning of __opencl_c_atomic_scope_all_devices is that the compiler supports the feature, but the behavior of the kernel could differ depending on the runtime feature support. If this is OK as per the intent of the spec, we will proceed with this solution. We could extend the underlying concept to #1047 as well. It would be good to put up a spec PR against #1047 and confirm that the updated text reads well and aligns with existing implementations.
This matches what we report for our coarse-grain SVM GPUs as well; see e.g. https://opencl.gpuinfo.org/displayreport.php?id=2215.
The text for __opencl_c_atomic_scope_all_devices already accounts for fallback behavior when fine-grained SVM is not supported. We should update the description of the runtime queries (CL_DEVICE_ATOMIC_MEMORY_CAPABILITIES and CL_DEVICE_ATOMIC_FENCE_CAPABILITIES) to specify the same behavior: when used on a fine-grained non-atomic SVM buffer, a coarse-grained SVM buffer, or a non-SVM buffer, CL_DEVICE_ATOMIC_SCOPE_ALL_DEVICES would behave like CL_DEVICE_ATOMIC_SCOPE_DEVICE.
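To make the fallback rule concrete, here is a minimal sketch in plain C. It is not taken from the spec or from any implementation: the `mem_scope` and `buffer_kind` enums and the `effective_scope` helper are hypothetical names, introduced only to model "all-devices scope degrades to device scope unless the buffer is fine-grained SVM with atomics".

```c
/* Hypothetical stand-ins for the memory scopes discussed above. */
typedef enum {
    SCOPE_WORK_ITEM,
    SCOPE_WORK_GROUP,
    SCOPE_DEVICE,
    SCOPE_ALL_SVM_DEVICES
} mem_scope;

/* Hypothetical classification of the buffer an atomic operates on. */
typedef enum {
    BUF_NON_SVM,
    BUF_COARSE_GRAIN_SVM,
    BUF_FINE_GRAIN_SVM,         /* fine-grained, without SVM atomics */
    BUF_FINE_GRAIN_SVM_ATOMICS  /* fine-grained, with SVM atomics */
} buffer_kind;

/* Models the quoted fallback: an all-devices scope request behaves like
 * device scope unless the buffer is fine-grained SVM with atomics.
 * Every other scope is passed through unchanged. */
static mem_scope effective_scope(mem_scope requested, buffer_kind buf)
{
    if (requested == SCOPE_ALL_SVM_DEVICES &&
        buf != BUF_FINE_GRAIN_SVM_ATOMICS) {
        return SCOPE_DEVICE;
    }
    return requested;
}
```

On a coarse-grain-only device, `effective_scope(SCOPE_ALL_SVM_DEVICES, BUF_COARSE_GRAIN_SVM)` yields `SCOPE_DEVICE`, which is the behavior the proposed wording would attach to the runtime capability as well.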
…1129 Clarify behavior of above capability value on devices that don't support fine grained SVM
* platform: Clarify behavior for ATOMIC_SCOPE_ALL_DEVICES
* Adjusted table widths to prevent overflow beyond a page. Asciidoctor has a limitation that prevents table cells from spanning across pages
* Update api/opencl_platform_layer.asciidoc

Co-authored-by: Ben Ashbaugh <ben.ashbaugh@intel.com>
This issue is related to #1047. We have a specific question.
Can an implementation omit CL_DEVICE_ATOMIC_SCOPE_ALL_DEVICES from the result of the clGetDeviceInfo query CL_DEVICE_ATOMIC_MEMORY_CAPABILITIES but still define the compiler feature macro __opencl_c_atomic_scope_all_devices?
In this case the implementation does report __opencl_c_atomic_scope_all_devices for the clGetDeviceInfo query CL_DEVICE_OPENCL_C_FEATURES.
From the spec - "When used on a fine-grained non-atomic SVM buffer, a coarse-grained
SVM buffer, or a non-SVM buffer, operations parameterized with memory_scope_all_svm_devices
will behave as if they were parameterized with memory_scope_device"
This would imply that it is ok for an implementation to report __opencl_c_atomic_scope_all_devices but not CL_DEVICE_ATOMIC_SCOPE_ALL_DEVICES since the behavior of kernels on such an implementation is defined in the spec.
There are some conformance tests which check for consistency between feature macro and device info queries. These will need to be adapted pending the outcome of this discussion. It does bring up the larger question of when it is appropriate to expect these queries to match.
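The two possible expectations can be sketched as follows. This is an illustration only, not the conformance suite's code: `strict_consistent` models a test that requires the macro and the capability bit to match exactly, while `relaxed_consistent` models the reading proposed here, where the macro may be defined without the bit. The helper names are hypothetical, and `ATOMIC_SCOPE_ALL_DEVICES_BIT` is a placeholder; real code would use CL_DEVICE_ATOMIC_SCOPE_ALL_DEVICES from the OpenCL headers.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Placeholder bit for illustration only; real code would use
 * CL_DEVICE_ATOMIC_SCOPE_ALL_DEVICES from CL/cl.h. */
#define ATOMIC_SCOPE_ALL_DEVICES_BIT (UINT64_C(1) << 6)

/* Returns true if `name` appears in a list of reported feature names,
 * e.g. names gathered from a CL_DEVICE_OPENCL_C_FEATURES query. */
static bool has_feature(const char *const *features, size_t count,
                        const char *name)
{
    for (size_t i = 0; i < count; ++i) {
        if (strcmp(features[i], name) == 0) {
            return true;
        }
    }
    return false;
}

/* Strict policy: the macro is defined if and only if the bit is set. */
static bool strict_consistent(uint64_t caps, bool macro_defined)
{
    return ((caps & ATOMIC_SCOPE_ALL_DEVICES_BIT) != 0) == macro_defined;
}

/* Relaxed policy (assumption, modelling the proposal in this issue):
 * the bit implies the macro, but the macro alone is acceptable. */
static bool relaxed_consistent(uint64_t caps, bool macro_defined)
{
    return macro_defined || (caps & ATOMIC_SCOPE_ALL_DEVICES_BIT) == 0;
}
```

A device that defines the macro but does not report the bit fails the strict check and passes the relaxed one, which is exactly the configuration under discussion.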
An additional consideration is how the compiler options -cl-std=CL3.0 and -cl-std=CL2.0 should affect compiler behavior in this case.