-
Notifications
You must be signed in to change notification settings - Fork 68
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Limit vec alignment to 64 bytes #448
Limit vec alignment to 64 bytes #448
Conversation
We currently require `vec` to be aligned according to its total storage size. For example, a `vec<uint32_t, 8>` must be aligned on a 32 byte boundary (32 = sizeof(uint32_t) * 8). This comes from OpenCL, which has the same requirement. The largest required alignment is for `vec<double, 16>`, which is typically 128 bytes. However, this is problematic in SYCL because we want to support a variety of host compilers. The MSVC compiler has a limit of 64 byte alignment for function parameters. Therefore, the SYCL alignment requirements makes it impossible to pass a `vec<double, 16>` as a function parameter when compiling with MSVC. This is particularly problematic because many of the SYCL "builtin" math function take `vec` as a parameter type. We therefore propose weakening the alignment requirement to make it possible to support MSVC. Of course, implementations are still allowed to align `vec` more highly if this results in more efficient code. We considered removing the special alignment requirements entirely for `vec` and leaving this as an implementation detail. However, existing code may depend on this alignment, so we opted for the most conservative change that still allows MSVC to work. Therefore, we propose capping the required alignment to 64 bytes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is a new interesting case of conflicting requirements between host and device.
It is not clear about what are the consequences for some hardware.
A program failing to run on a pedantic OpenCL device?
I have the feeling that it prevents using compilation flow with different compilers for host and device when they do not agree on this alignment constraint because it would not be possible to inject a compiler pass into MSVC to arbitrary increase the alignment by adding some padding elements in some structs for example.
A vendor is responsible for providing a complete SYCL implementation, including both compilers and backends. If the vendor uses a pedantic OpenCL implementation that really does require Another option is to implement SYCL
I do no think we ever expected that a vendor could use any arbitrary combination of host and device compiler to implement SYCL. A vendor is responsible for choosing (or configuring) the compilers to agree on certain things like the size of fundamental types, byte order, alignment requirements, etc.
Note that the proposed change here does not add a new alignment constraint. In fact, it does the opposite -- it weakens an alignment constraint that already existed in the SYCL spec. Perhaps your concern is not with the change proposed in this PR, but rather you are concerned with the original wording of the spec that guarantees any alignment of the |
I am still confused about how it can work with a pedantic OpenCL implementation. |
This is a good point. I think you're right, it's probably not possible to have a conformant implementation of SYCL on Windows that uses msvc as the host compiler and also uses a pedantic version of OpenCL that requires high alignment of The change proposed in this PR still gives implementors a choice, though. They can either use msvc and an OpenCL implementation that tolerates |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the argumentation.
Please add non-normative note justifying this. |
Following up on this week's discussion on whether we should keep a changelog, this would be another good use-case IMO: including a quick note as to why a change was made, if it is not immediately obvious (like for a bug fix). |
Add a non-normative note explaining why the guaranteed alignment is limited to 64 bytes.
Done in 8d41edc |
Ready to merge. |
KhronosGroup/SYCL-Docs#448 limited sycl::vec alignment to 64 bytes
KhronosGroup/SYCL-Docs#448 limited sycl::vec alignment to 64 bytes
Apply sycl::vec 64 bytes limit. KhronosGroup/SYCL-Docs#448 limited sycl::vec alignment to 64 bytes
Limit vec alignment to 64 bytes
Limit vec alignment to 64 bytes (cherry picked from commit 472b3a2)
Limit vec alignment to 64 bytes (cherry picked from commit 472b3a2)
We currently require
vec
to be aligned according to its total storage size. For example, avec<uint32_t, 8>
must be aligned on a 32 byte boundary (32 = sizeof(uint32_t) * 8). This comes from OpenCL, which has the same requirement. The largest required alignment is forvec<double, 16>
, which is typically 128 bytes.However, this is problematic in SYCL because we want to support a variety of host compilers. The MSVC compiler has a limit of 64 byte alignment for function parameters. Therefore, the SYCL alignment requirements makes it impossible to pass a
vec<double, 16>
as a function parameter when compiling with MSVC. This is particularly problematic because many of the SYCL "builtin" math function takevec
as a parameter type.We therefore propose weakening the alignment requirement to make it possible to support MSVC. Of course, implementations are still allowed to align
vec
more highly if this results in more efficient code.We considered removing the special alignment requirements entirely for
vec
and leaving this as an implementation detail. However, existing code may depend on this alignment, so we opted for the most conservative change that still allows MSVC to work. Therefore, we propose capping the required alignment to 64 bytes.