Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SYCL] Add force range rounding option and introduce new compiler flag #12715

Merged
merged 8 commits into from
Mar 20, 2024

Conversation

hdelan
Copy link
Contributor

@hdelan hdelan commented Feb 14, 2024

Adds a new preference for range rounding, force, such that if the compile flag is used, only the range rounded parallel_for kernel will be generated. This can make binaries smaller as there is no duplication of SYCL range kernels across range rounded and unrounded versions.

I have also added the flag: -fsycl-range-rounding, which can have values: on, force or disable. This flag aims to supercede the fsycl-disable-range-rounding flag.

I have also added to existing tests to check for the functionality of the new flag and refactored the range rounding sycl-e2e test. Also added brief description of flag's behaviour in doc

clang/include/clang/Basic/LangOptions.h Outdated Show resolved Hide resolved
clang/include/clang/Basic/LangOptions.def Outdated Show resolved Hide resolved
clang/lib/Sema/SemaSYCL.cpp Show resolved Hide resolved
sycl/include/sycl/handler.hpp Outdated Show resolved Hide resolved
sycl/test-e2e/Basic/parallel_for_range_roundup.cpp Outdated Show resolved Hide resolved
sycl/test-e2e/Basic/parallel_for_range_roundup.cpp Outdated Show resolved Hide resolved
clang/include/clang/Basic/LangOptions.def Show resolved Hide resolved
sycl/test-e2e/Basic/parallel_for_range_roundup.cpp Outdated Show resolved Hide resolved
@hdelan
Copy link
Contributor Author

hdelan commented Feb 19, 2024

CUDA should fail because of additional checks in parallel_for_range_roundup.cpp which exposes this bug oneapi-src/unified-runtime#1326 . So this PR depends on that UR PR

@hdelan hdelan closed this Mar 20, 2024
@hdelan hdelan reopened this Mar 20, 2024
This commit adds a new preference for range rounding, force, such that if the
compile flag is used, only the range rounded parallel_for kernel will be generated.
This can make binaries smaller as there is no duplication of SYCL range kernels
across range rounded and unrounded versions.

I have also added the flag: -fsycl-range-rounding, which can have values: on, force
or disable. This flag aims to supercede the fsycl-disable-range-rounding flag.

I have also added to existing tests to check for the functionality of the new flag
and refactored the range rounding sycl-e2e test.
Add some description of how the -fsycl-range-rounding flag should be used.
Compiler invocation was missing -Xclang
- Change if else to switch in integration header emission and init preprocessor
- Change comment in handler.hpp
- Change comments and use static_assert with message in test-e2e
- Change enum to have no defined int values
- Wrap long line in LangOptions.def
#
Makes -fsycl-range-rounding= accessible to driver calls, not just cc1 invocations.
Range rounding disable is tested in another test.
Range rounding is disabled for -O0 but now a user preference for range rounding can
override this.
Make sure that if -fsycl-range-rounding=force is used, there is no
emission of the unrounded range kernel at -O0 and -Od
@hdelan
Copy link
Contributor Author

hdelan commented Mar 20, 2024

Ping @mdtoguchi for review, thanks

Copy link
Contributor

@ldrumm ldrumm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Looks good!

Copy link
Contributor

@mdtoguchi mdtoguchi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@smanna12 smanna12 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FE Changes LGTM

@hdelan
Copy link
Contributor Author

hdelan commented Mar 20, 2024

Ping @intel/llvm-gatekeepers this can be merged

@ldrumm ldrumm merged commit 389e2e5 into intel:sycl Mar 20, 2024
13 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants