[SYCL] Add support for JIT-ing in AMD and NVIDIA backends #14280
Also allow resetting of SYCL_CACHE_IN_MEM.
I just remembered that I asked about tests. What is the plan for E2E or SYCL-level unit tests?
Testing this is a bit tricky, but I thought we could use the debug output of the specialization constant materializer pass as proof that JIT compilation takes place (as it is the only pass in the JIT pipeline). I have it almost ready, exercising both a regular kernel and a kernel bundle. I am going to add it to e2e and will ping you once it's pushed to this PR.
OK, @steffenlarsen let me know what you think about this test, thank you!
Thanks for adding the test! Looks like a reasonable way to cover it.
// RUN: %{build} -fsycl-embed-ir -o %t.out
// RUN: env SYCL_JIT_AMDGCN_PTX_KERNELS=1 env SYCL_JIT_COMPILER_DEBUG="sycl-spec-const-materializer" %{run} %t.out &> %t.txt ; FileCheck %s --input-file %t.txt

#include <sycl/sycl.hpp>
`sycl/sycl.hpp` is not permitted in E2E tests. You want `sycl/detail/core.hpp` and then probably some other headers.
Done.
@intel/llvm-gatekeepers I think we might be good to go on this one.
@jchlanda - Sadly there seem to be more failures related to it. Could you please take a look?
@jchlanda Do you mind confirming if you're working on the problem? If it's outside of working hours for you we might want to revert. Thanks
I'm looking into this, I think Jakub is out of office now.
@Naghasan Thanks!
It seems the NDEBUG macro was causing the issue; this should fix #14777
@Naghasan There are also unused param warnings:
https://github.com/intel/llvm/actions/runs/10095439923/job/27915595323
This patch provides the following:

* Support for JIT compilation of Nvidia and AMD kernels. This is guarded by the `SYCL_JIT_KERNELS` environment variable. Target CPU and features can be provided through environment variables (`SYCL_JIT_TARGET_CPU` and `SYCL_JIT_TARGET_FEATURES` respectively); otherwise default, per-backend values will be chosen.
* Caching of JIT-compiled kernels. The runtime maintains a map of available JIT-ed kernels, accessible through a key, which is constructed from the kernel's name and the values of its specialization constants (if provided).
* Materialization of specialization constants. Materialization is done through a `SYCLSpecConstMaterializer` pass that receives the values of all specialization constants used by a kernel (from `SYCLSpecConstDataInserter`) and then walks all the uses of the implicit kernel argument (`_arg__specialization_constants_buffer`), which represents emulated specialization constants, replacing them with concrete values and so turning them into de facto compile-time constants.

This PR extends the work done for kernel fusion and, in a similar fashion, requires embedding of IR (`-fsycl-embed-ir`) during the initial compilation.
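To illustrate the caching scheme described above, here is a minimal, standalone sketch of a map keyed by kernel name plus specialization constant values. It is not the runtime's actual code: `JitKernel`, `JitKernelCache`, and `toBytes` are hypothetical names, and representing the constant values as a raw byte vector is an assumption made for the example.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a handle to a JIT-compiled kernel.
struct JitKernel {
  std::string Binary;
};

// Assumed key layout: kernel name plus the raw bytes of the specialization
// constant values. An empty vector means no constants were provided.
using CacheKey = std::pair<std::string, std::vector<std::uint8_t>>;

class JitKernelCache {
public:
  // Returns the cached kernel, or nullptr if this (name, values)
  // combination has not been JIT-compiled yet.
  const JitKernel *lookup(const std::string &Name,
                          const std::vector<std::uint8_t> &SpecConsts) const {
    auto It = Cache.find(CacheKey(Name, SpecConsts));
    return It == Cache.end() ? nullptr : &It->second;
  }

  // Stores (or overwrites) the kernel for this (name, values) combination.
  void insert(const std::string &Name,
              const std::vector<std::uint8_t> &SpecConsts, JitKernel Kernel) {
    Cache[CacheKey(Name, SpecConsts)] = std::move(Kernel);
  }

  std::size_t size() const { return Cache.size(); }

private:
  std::map<CacheKey, JitKernel> Cache;
};

// Hypothetical helper: copies a trivially copyable value into a byte vector.
template <typename T> std::vector<std::uint8_t> toBytes(const T &Value) {
  std::vector<std::uint8_t> Bytes(sizeof(T));
  std::memcpy(Bytes.data(), &Value, sizeof(T));
  return Bytes;
}
```

With this layout, the same kernel launched with two different specialization constant values occupies two cache entries, while a repeated launch with the same value is a cache hit.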
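The environment-variable override with a per-backend fallback, as described for `SYCL_JIT_TARGET_CPU`, could look roughly like the sketch below. The `Backend` enum, the helper names, and in particular the default CPU strings are placeholders chosen for illustration; the PR description does not state which defaults the runtime picks. Treating an empty variable as unset is also an assumption.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical backend tag for this sketch.
enum class Backend { Nvidia, Amd };

// Placeholder defaults for illustration only; the real per-backend values
// are not given in the PR description.
inline std::string defaultTargetCpu(Backend B) {
  return B == Backend::Nvidia ? "sm_50" : "gfx906";
}

// Returns the variable's value if it is set and non-empty (assumption),
// otherwise the supplied fallback.
inline std::string envOr(const char *Name, const std::string &Fallback) {
  const char *Value = std::getenv(Name);
  return (Value && *Value) ? std::string(Value) : Fallback;
}

// The user-provided target CPU wins; otherwise use the backend default.
inline std::string targetCpu(Backend B) {
  return envOr("SYCL_JIT_TARGET_CPU", defaultTargetCpu(B));
}
```

The same pattern would apply to `SYCL_JIT_TARGET_FEATURES`, with a features string in place of the CPU name.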