[CUDA] Dynamically load the CUPTI library when tracing #1070
Does this work/do anything when built out of sycl tree? As far as I can tell, we don't set XPTI_ENABLE_INSTRUMENTATION
anywhere, and the cuda adapter never links with xpti.
It might be useful to set up a CI job that builds cuda with tracing enabled.
…or XPTI tracing (#11866) This is a prerequisite for implementing dynamic loading of the CUPTI library when XPTI tracing is enabled. See oneapi-src/unified-runtime#1070
I haven't built UR out of the SYCL tree, but my understanding is that the CUDA tracing support is #ifdef'd out as you mention. I don't know if xpti should be a required dependency of UR, but this work is about making That's a good point, I will ask internally about adding CUDA tracing to a CI job.
c953ee6 to ae6905d
I've asked internally and the consensus is that tracing should be enabled in CI. This is better done in a separate PR due to the dependencies between this repo and https://github.com/intel/llvm (e.g. changing the signature for
I have also added some changes to use
Please create an issue for tracking this https://github.com/oneapi-src/unified-runtime/issues/new
I have created #1098 for this.
Created CI testing PR: intel/llvm#11952
I have updated the target branch of this PR from the
ae6905d to c990788
Codecov Report
All modified and coverable lines are covered by tests ✅

Additional details and impacted files

@@           Coverage Diff           @@
##             main    #1070   +/-   ##
=======================================
  Coverage   15.73%   15.73%
=======================================
  Files         223      223
  Lines       31466    31465    -1
  Branches     3556     3556
=======================================
  Hits         4952     4952
+ Misses      26463    26462    -1
  Partials       51       51
Thanks @fabiomestre, I have rebased these changes on top of latest
I wonder, what's the content of
e8076c4 to 7afc5b8
On my system, this is the following path:
I agree that the mechanism to find the CUPTI library currently does not give any feedback when the library cannot be found, and that might be confusing for the user. However, the previous situation, where the CUPTI library cannot be found and this prevents loading the CUDA adapter, is even worse, as the DPC++ application cannot be run at all. The goal of this PR is to solve that particular situation; improving the loading process (e.g. with a user-configurable path to override the build-time path) and feedback (e.g. run-time warnings) is better done in a separate PR.
If you could create a new issue to track this future work @pasaulais, that would be much appreciated.
Thanks!
Regarding your point about 'just the library name' (relative path), I am curious as to why that would be less secure than explicitly linking against The reason I am asking is loading
I've checked the ld.so man page, and it seems the only way to load a library from a user-writable path is if that path is part of LD_LIBRARY_PATH, which is probably fine (?). But on Windows the first thing in the library search order is either the CWD or the binary directory, which can be user-writable and can potentially contain a malicious DLL. I remember that the SYCL runtime stopped loading libraries with just the name for the same reason. Anyway, I'm not an expert on the topic and you should consult with security champions. Two alternatives would be either to link cupti statically, or to make a separate tracing library and load it from the same location as the adapter library, which would eliminate the case of loading libraries with just the name.
AFAIK, on Linux it's ok to let the loader find the appropriate library file, but on Windows, as you say, we need to provide a fully qualified name for it to be safe. That's what we do in the UR loader.
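The policy described in the two comments above can be sketched as a small path check. This is only an illustration of the rule as stated in this thread, not the UR loader's actual code: the `Platform` enum, both function names, and the example file names are hypothetical.

```cpp
#include <cctype>
#include <string>

// Hypothetical platform tag, used only for this illustration.
enum class Platform { Linux, Windows };

// True for drive-letter paths ("C:\dir\lib.dll") and UNC paths
// ("\\server\share\lib.dll"). Bare names and relative paths are rejected.
bool isFullyQualifiedWindowsPath(const std::string &Path) {
  const bool HasDrive =
      Path.size() >= 3 && std::isalpha(static_cast<unsigned char>(Path[0])) &&
      Path[1] == ':' && (Path[2] == '\\' || Path[2] == '/');
  const bool IsUnc = Path.size() >= 2 && Path[0] == '\\' && Path[1] == '\\';
  return HasDrive || IsUnc;
}

// Assumed policy from the discussion: on Linux a bare library name is left to
// the dynamic loader's search rules, while on Windows only fully qualified
// paths are accepted, so that a DLL planted in the current or binary
// directory is never picked up by name alone.
bool isAcceptableLibraryPath(const std::string &Path, Platform P) {
  if (Path.empty())
    return false;
  if (P == Platform::Windows)
    return isFullyQualifiedWindowsPath(Path);
  return true;
}
```

With this check, a bare name such as `example.dll` would be refused on Windows before it ever reaches the system loader, while `libcupti.so` would still be accepted on Linux.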
That makes sense, thanks for the clarification about why it's a security issue on Windows. One thing I'm not sure about with linking cupti statically is version mismatches. Do you know if there is support for linking against one version of the CUDA toolkit and running on a system with a different toolkit version? With shared libraries, the cupti library will have the same version as the CUDA runtime library (even if a different version was used to build the CUDA adapter).
@pasaulais my understanding is that it does not matter whether you link statically or dynamically. NVIDIA guarantees backwards compatibility inside one major version. That is, if you link against CUDA 12.3, your build will fail on CUDA 12.1 machines no matter what, as the driver version is too low. But the other direction should work just fine: a 12.1 library should work in a 12.3 environment. In fact,
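To make the rule in this comment concrete, here is a minimal sketch that encodes it exactly as the commenter describes it (same major version, and a runtime environment at least as new as the build). It reflects the commenter's understanding, not NVIDIA's official compatibility policy. The struct and function names are hypothetical; the only external fact used is that the `CUDA_VERSION` macro encodes a version as 1000 * major + 10 * minor.

```cpp
// Hypothetical helper type for this illustration.
struct CudaVersion {
  int Major;
  int Minor;
};

// Decodes a CUDA_VERSION-style integer, e.g. 12030 for CUDA 12.3.
CudaVersion decodeCudaVersion(int Encoded) {
  return CudaVersion{Encoded / 1000, (Encoded % 1000) / 10};
}

// Rule as stated in the comment above (an assumption, not a guarantee):
// a build is expected to run when the major versions match and the runtime
// environment is not older than the toolkit used for the build.
bool expectedToRun(CudaVersion Build, CudaVersion Runtime) {
  return Build.Major == Runtime.Major && Build.Minor <= Runtime.Minor;
}
```

Under this rule a 12.1 build is expected to run in a 12.3 environment, while a 12.3 build is not expected to run in a 12.1 environment.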
f79902a to f0ea491
This is something that could use revisiting in order to make a good decision for how to link these libraries, so I have created an issue for this: #1251
f0ea491 to 4b2ac71
CI testing for oneapi-src/unified-runtime#1070 --------- Co-authored-by: Kenneth Benzie (Benie) <k.benzie@codeplay.com>
oneapi-src/unified-runtime#1070 and #11952 introduced a new variant of the `enableCUDATracing` function that takes a context pointer parameter, replacing the parameterless variant of that function. The older variant will be removed from UR once this PR is merged.
With these changes, `libcupti.so` is loaded dynamically when CUDA tracing is enabled. This enables XPTI tracing-enabled builds to work on systems that do not have `libcupti.so` or where that library cannot be located on the system.

The `enableCUDATracing` and `disableCUDATracing` functions have been changed to take a context pointer, rather than use global variables for tracing state.

There is a temporary `enableCUDATracing` variant with no parameter for compatibility until the relevant changes have been merged to https://github.com/intel/llvm.
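
As a rough illustration of the approach described above, the following sketch keeps tracing state in a context object and loads a library lazily with POSIX `dlopen`, treating a missing library as a non-fatal condition. It is not the adapter's actual code: `TracingContext`, `enableTracingSketch`, `disableTracingSketch`, and the `LastError` field are hypothetical names, and the real `enableCUDATracing` signature is not shown in this thread. On older glibc versions, linking may additionally require `-ldl`.

```cpp
#include <cassert>
#include <dlfcn.h>
#include <string>

// Hypothetical tracing state, held per context instead of in globals.
struct TracingContext {
  void *LibHandle = nullptr; // handle returned by dlopen, null when not loaded
  bool Enabled = false;      // true once the library has been loaded
  std::string LastError;     // loader message from the last failed attempt
};

// Tries to load the library at LibPath and marks tracing as enabled.
// Returns false, without aborting, when the library cannot be loaded, so the
// caller can keep running without tracing.
bool enableTracingSketch(TracingContext *Ctx, const char *LibPath) {
  assert(Ctx != nullptr && LibPath != nullptr);
  if (Ctx->Enabled)
    return true;
  dlerror(); // clear any stale loader error
  Ctx->LibHandle = dlopen(LibPath, RTLD_NOW | RTLD_LOCAL);
  if (Ctx->LibHandle == nullptr) {
    const char *Msg = dlerror();
    Ctx->LastError = Msg ? Msg : "unknown dlopen failure";
    return false;
  }
  Ctx->LastError.clear();
  Ctx->Enabled = true;
  return true;
}

// Unloads the library, if any, and resets the tracing state.
void disableTracingSketch(TracingContext *Ctx) {
  assert(Ctx != nullptr);
  if (Ctx->LibHandle != nullptr) {
    dlclose(Ctx->LibHandle);
    Ctx->LibHandle = nullptr;
  }
  Ctx->Enabled = false;
}
```

In a real adapter, the entry points needed from the library would then be resolved from `LibHandle` with `dlsym`. Recording the loader message in the context is one possible way to provide the run-time feedback discussed earlier in the thread.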