Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[NVPTX][AMD][New offload model] Add support for -fsycl-embed-ir in the new offloading model #14526

Merged

Conversation

asudarsa
Copy link
Contributor

When compiling for Nvidia/AMD devices and the user requested the IR to be embedded in the application (via option), We run the output of sycl-post-link (filetable referencing LLVM Bitcode + symbols) through the offload wrapper and link the resulting object to the application.

@asudarsa asudarsa requested review from a team as code owners July 11, 2024 05:05
@asudarsa
Copy link
Contributor Author

There will be merge issues with a parallel PR. #13806
Depending on which one gets merged, the other PR will need to be adjusted a bit.

Thanks

Copy link
Contributor

@sarnex sarnex left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, minor comments!

@@ -0,0 +1,106 @@
// RUN: %{build} %{embed-ir} -O2 --offload-new-driver -o %t.out
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we put this in the sycl/test-e2e/NewOffloadDriver folder instead?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the plan for these tests after the new offload driver becomes the default?

If the plan is to delete the tests using the new driver explicitly after the default behavior is changed, I agree with @sarnex that it would make sense to put them all in the same place.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hi Lukas and Nick

Thanks so much for feedback here.
I am going to remove the E2E test here. I have a new PR coming in very soon that adds a sequence of test for new driver and I will include this there.
I evaluated three options:

  1. Adding a new file in the same location as the original test
  2. Adding a new file in a separate folder
  3. Adding few lines in existing test to enable new offload testing

It seems to me that Option #3 will be the easiest to maintain when we make changes to the test in the future.
@sarnex and @sommerlukas WDYT?

Thanks

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO 2) is the easiest to maintain, they're all in the same place and won't bog down teams not working on tools

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @sarnex . I thought about that. But, users making changes to the original test should remember to change the NewOffloadModel test as well. I thought that was a painpoint.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't feel super strongly on this, so if the test owners are okay with it, it's fine with me.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't feel strong about the location, but I'd prefer if we had a test actually using the IR embedded by the flag.

For example, I think that none of the tests currently in this PR check whether the embedded IR has the correct target name/prefix as expected by the SYCL runtime.

With an e2e test, we would ensure that IR embedding is actually functional.

Option 3 would also be viable to achieve that.

Copy link
Contributor Author

@asudarsa asudarsa Jul 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am adding the test back. Based on further 'internal thinking' and the fact that there is no 'strong' push for a particular option , I have decided to go with option 2. This will atleast avoid triaging headaches as @sarnex suggested.

Thanks again for discussion.

clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp Outdated Show resolved Hide resolved
clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp Outdated Show resolved Hide resolved
/// check for -sycl-embed-ir transmission to clang-linker-wrapper tool
// RUN: %clangxx -fsycl -### -fsycl-targets=nvptx64-nvidia-cuda \
// RUN: -fno-sycl-libspirv -nocudalib --offload-new-driver \
// RUN: -fsycl-embed-ir %s 2>&1 \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if -fsycl-embed-ir is passed for devices other than NVidia or AMD?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to our implementation, it will be ignored. I can add a test for it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is that acceptable? Would a warning or error be appropriate here?

clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp Outdated Show resolved Hide resolved
@@ -0,0 +1,106 @@
// RUN: %{build} %{embed-ir} -O2 --offload-new-driver -o %t.out
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the plan for these tests after the new offload driver becomes the default?

If the plan is to delete the tests using the new driver explicitly after the default behavior is changed, I agree with @sarnex that it would make sense to put them all in the same place.

Copy link
Contributor

@mdtoguchi mdtoguchi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks OK to me.

@asudarsa
Copy link
Contributor Author

Removed the 'ccc-print-phases' testing based on comment from @mdtoguchi

Thanks

…e new offloading model

Signed-off-by: Arvind Sudarsanam <arvind.sudarsanam@intel.com>
Signed-off-by: Arvind Sudarsanam <arvind.sudarsanam@intel.com>
Signed-off-by: Arvind Sudarsanam <arvind.sudarsanam@intel.com>
…ired

Signed-off-by: Arvind Sudarsanam <arvind.sudarsanam@intel.com>
@asudarsa asudarsa force-pushed the add_sycl_embed_ir_support_new_offload branch from 9c76275 to e34ae78 Compare July 12, 2024 17:26
Signed-off-by: Arvind Sudarsanam <arvind.sudarsanam@intel.com>
Copy link
Contributor

@KseniyaTikhomirova KseniyaTikhomirova left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@maksimsab
Copy link
Contributor

The following test is unrelated to the current PR since it fails in others PRs.
SYCL :: EnqueueNativeCommand/custom-command-multiple-dev-cuda.cpp

@maksimsab
Copy link
Contributor

@intel/llvm-gatekeepers Can we merge it?

@sommerlukas
Copy link
Contributor

Unrelated failure of CUDA CI is tracked in #14715.

@sommerlukas sommerlukas merged commit 676e851 into intel:sycl Jul 25, 2024
14 of 15 checks passed
hdelan pushed a commit to hdelan/llvm that referenced this pull request Jul 26, 2024
…e new offloading model (intel#14526)

When compiling for Nvidia/AMD devices and the user requested the IR to
be embedded in the application (via option), We run the output of
sycl-post-link (filetable referencing LLVM Bitcode + symbols) through
the offload wrapper and link the resulting object to the application.

---------

Signed-off-by: Arvind Sudarsanam <arvind.sudarsanam@intel.com>
Co-authored-by: Sabianin, Maksim <maksim.sabianin@intel.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants