-
Notifications
You must be signed in to change notification settings - Fork 738
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SYCL]Fix SYCL target triple check for NVidia targets. #14505
Conversation
static bool isValidSYCLTriple(llvm::Triple TargetTriple) { | ||
// Check for invalid SYCL device triple values for NVidia GPU targets. | ||
// nvptx64-nvidia-cuda is the valid triple for NVidia targets. | ||
if (TargetTriple.getArchTypeName(TargetTriple.getArch()) == "nvptx64" && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if (TargetTriple.getArchTypeName(TargetTriple.getArch()) == "nvptx64" && | |
if (TargetTriple.isNVPTX() && |
Why do we need to do string comparisons?
// Check for invalid SYCL device triple values for NVidia GPU targets. | ||
// nvptx64-nvidia-cuda is the valid triple for NVidia targets. | ||
if (TargetTriple.getArchTypeName(TargetTriple.getArch()) == "nvptx64" && | ||
TargetTriple.getVendorTypeName(TargetTriple.getVendor()) == "nvidia" && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Once again, I would prefer to avoid string comparison. Target.getVendor() == Triple::VendorType::NVIDIA
should be enough
.StartsWith("rtems", Triple::RTEMS) | ||
.StartsWith("nacl", Triple::NaCl) | ||
.StartsWith("aix", Triple::AIX) | ||
.Case("cuda", Triple::CUDA) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm hesitant to accept this change at intel/llvm. If you can get this accepted to the upstream LLVM, then it is fine.
This change is at least inconsistent with handling of other OSes. For example, clang
will still accept nvptx-nvidia-nvclmyown
as a valid triple for OpenCL on NVIDIA, even though the spelling contains some extra myown
.
My main concern is that no one prohibits using other things from intel/llvm
besides SYCL (and I know for sure that folks are using OpenMP for example) and therefore this change could break someone's workflow that we don't know about. Since we don't mark our customizations to the upstream codebase in any way, this change could easily get lost, cause conflicts or be hard to find during debugging.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently, when we pass an invalid target triple value to fsycl-targets
for NVidia targets , we get the following error:
fatal error: error in backend: Cannot select: intrinsic %llvm.nvvm.implicit.offset
llvm-foreach:
clang++: error: clang frontend command failed with exit code 70 (use -v to see invocation)
This error is non-intuitive and does not clearly describe what the problem is.
Currently, in the driver code base, there is no check to verify if the target triple string passed matches the exact value documented - nvptx64-nvidia-cuda
We only check if the ‘architecture’ component has ‘nvptx64’ and return that as a valid string.
Example: nvptx64-badVendor-badOS would still be passed as a valid triple to the toolchain and we end up getting an error.
For OpenMP, they do allow partial/shortened triple strings with empty Vendor or/and OS components. The following triples are all valid. Please see this function
nvptx64
nvptx64-nvidia
nvptx64--cuda
As for this change: .Case("cuda", Triple::CUDA)
When we just check using StartsWith
, for an example like this - nvptx64-nvidia-cuda123,
the 123 get parsed as the OS Major version but the triple still gets rejected because cuda123 is not a valid OS component for offloading SYCL to nvptx backend.
I can remove the string comparisons as parseArch() , parseVendor() and parseOS() all use .Case match for Nvidia targets.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the best we can do is check for Triple::CUDA
for the OS and not try to be strict about it. Sure we would be allowing OS values of cudaBadString, but given how doing so is consistent with other OS settings I'm hard pressed to deviate. If the failure occurs down the line when using cudaBadString, there may be an issue later in the compilation when using the triple that isn't using the known OS check.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This error is non-intuitive and does not clearly describe what the problem is.
I agree, but I still don't think that the suggested customization is right in context of deviation from the community code, further maintenance effort and potential unexpected/unwanted side effects.
For OpenMP, they do allow partial/shortened triple strings with empty Vendor or/and OS components.
We should adopt this in sycl
and it will automatically reduce the amount of uses of longer triple that is prone to errors. Moreover, -fsycl-targets
will eventually move to --arch
that accepts a target name and not a triple, resolving this issue all together.
-fsycl-targets
enables AOT compilations for specified device targets.For NVidia GPU targets, per oneAPI DPCPP documentation, the valid target triple string is
nvptx64-nvidia-cuda
This PR, checks if the target triple string provided for NVidia targets is a valid one.
Example: