
[UR][CUDA][L0][HIP] Add virtual memory adapter implementations #939

Merged: 32 commits merged into oneapi-src:main on Dec 18, 2023

Conversation

steffenlarsen (Contributor)

This commit adds the adapter implementations of the virtual memory extension functionality to be used in
intel/llvm#8954.


Signed-off-by: Larsen, Steffen <steffen.larsen@intel.com>
Signed-off-by: Larsen, Steffen <steffen.larsen@intel.com>
Signed-off-by: Larsen, Steffen <steffen.larsen@intel.com>
@jandres742

Thanks @steffenlarsen!

Signed-off-by: Larsen, Steffen <steffen.larsen@intel.com>
@steffenlarsen steffenlarsen requested a review from a team as a code owner October 10, 2023 10:38
@JackAKirk (Contributor) commented Oct 10, 2023

From my understanding, this should follow the experimental feature guidelines from https://oneapi-src.github.io/unified-runtime/core/CONTRIB.html?

For example, the guidelines state:

"Before making a contribution to the specification you should determine if the change should be made directly to the core specification or introduced as an experimental feature. The criteria we use to make this distinction are as follows:

The feature exists to enable an experimental feature in a parallel language runtime being built on top of Unified Runtime.

...
"

By the way, do you know if the HIP adapter can support this in the future?

@steffenlarsen (Contributor, Author)

> From my understanding, this should follow the experimental feature guidelines from https://oneapi-src.github.io/unified-runtime/core/CONTRIB.html?

My understanding was that this applied to UR interfaces. This does not add any new UR interfaces, just the implementations of some that were added a while ago. We could argue that it should be experimental, but that would mean changing the existing interfaces and I don't know if that is allowed?

> By the way, do you know if the HIP adapter can support this in the future?

I believe https://docs.amd.com/projects/HIP/en/docs-5.4.1/doxygen/html/group___virtual.html describes the HIP interfaces. They should be similar to the CUDA ones.
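
For reference, the flow exposed by those HIP interfaces mirrors the CUDA driver API and maps fairly directly onto the UR entry points this PR implements. The sketch below is only an illustration based on the linked docs; the exact struct fields, enum values, and flag names are assumptions and should be checked against the HIP headers.

```cpp
// Hedged sketch: reserve a virtual range, back it with physical memory, map it
// on device 0, and enable access, using HIP's virtual memory management API.
#include <hip/hip_runtime.h>
#include <cstdio>

int main() {
  int device = 0;
  hipMemAllocationProp prop = {};
  prop.type = hipMemAllocationTypePinned;
  prop.location.type = hipMemLocationTypeDevice;
  prop.location.id = device;

  // Granularity query (roughly urVirtualMemGranularityGetInfo).
  size_t granularity = 0;
  hipMemGetAllocationGranularity(&granularity, &prop,
                                 hipMemAllocationGranularityMinimum);
  size_t size = granularity; // sizes must be multiples of the granularity

  // Physical allocation (roughly urPhysicalMemCreate).
  hipMemGenericAllocationHandle_t handle;
  hipMemCreate(&handle, size, &prop, 0);

  // Virtual address reservation and mapping (roughly urVirtualMemReserve/Map).
  void *ptr = nullptr;
  hipMemAddressReserve(&ptr, size, 0, nullptr, 0);
  hipMemMap(ptr, size, 0, handle, 0);

  // Enable read/write access from the device (roughly urVirtualMemSetAccess).
  hipMemAccessDesc access = {};
  access.location = prop.location;
  access.flags = hipMemAccessFlagsProtReadWrite;
  hipMemSetAccess(ptr, size, &access, 1);

  std::printf("mapped %zu bytes at %p\n", size, ptr);

  // Teardown (roughly urVirtualMemUnmap, urVirtualMemFree, urPhysicalMemRelease).
  hipMemUnmap(ptr, size);
  hipMemAddressFree(ptr, size);
  hipMemRelease(handle);
  return 0;
}
```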

@JackAKirk (Contributor)

>> From my understanding, this should follow the experimental feature guidelines from https://oneapi-src.github.io/unified-runtime/core/CONTRIB.html?

> My understanding was that this applied to UR interfaces. This does not add any new UR interfaces, just the implementations of some that were added a while ago. We could argue that it should be experimental, but that would mean changing the existing interfaces and I don't know if that is allowed?

>> By the way, do you know if the HIP adapter can support this in the future?

> I believe https://docs.amd.com/projects/HIP/en/docs-5.4.1/doxygen/html/group___virtual.html describes the HIP interfaces. They should be similar to the CUDA ones.

I see, makes sense, thanks.

Signed-off-by: Larsen, Steffen <steffen.larsen@intel.com>
Signed-off-by: Larsen, Steffen <steffen.larsen@intel.com>
Signed-off-by: Larsen, Steffen <steffen.larsen@intel.com>
Signed-off-by: Larsen, Steffen <steffen.larsen@intel.com>
@steffenlarsen steffenlarsen requested a review from a team as a code owner October 13, 2023 09:45
@steffenlarsen steffenlarsen requested review from a team as code owners October 20, 2023 13:26
@AllanZyne (Contributor) commented Nov 16, 2023

Hi @steffenlarsen.

I think the DDI tables in source/adapters/level_zero/ur_interface_loader.cpp need to be updated too, e.g. urGetVirtualMemProcAddrTable and urGetPhysicalMemProcAddrTable.

Thanks.
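
For anyone unfamiliar with the loader plumbing, the requested change is roughly of this shape. This is only a sketch: the table typedefs, member names (e.g. ur_virtual_mem_dditable_t, pfnReserve), and the validateProcInputs helper are assumptions based on UR naming conventions, not code copied from the repository.

```cpp
#include <ur_api.h>

// Sketch: point the virtual/physical memory DDI table entries at the Level
// Zero adapter's implementations so the loader can dispatch to them.
UR_APIEXPORT ur_result_t UR_APICALL urGetVirtualMemProcAddrTable(
    ur_api_version_t version, ur_virtual_mem_dditable_t *pDdiTable) {
  auto result = validateProcInputs(version, pDdiTable); // hypothetical helper
  if (UR_RESULT_SUCCESS != result) {
    return result;
  }
  pDdiTable->pfnGranularityGetInfo = urVirtualMemGranularityGetInfo;
  pDdiTable->pfnReserve = urVirtualMemReserve;
  pDdiTable->pfnFree = urVirtualMemFree;
  pDdiTable->pfnMap = urVirtualMemMap;
  pDdiTable->pfnUnmap = urVirtualMemUnmap;
  pDdiTable->pfnSetAccess = urVirtualMemSetAccess;
  pDdiTable->pfnGetInfo = urVirtualMemGetInfo;
  return UR_RESULT_SUCCESS;
}

UR_APIEXPORT ur_result_t UR_APICALL urGetPhysicalMemProcAddrTable(
    ur_api_version_t version, ur_physical_mem_dditable_t *pDdiTable) {
  auto result = validateProcInputs(version, pDdiTable); // hypothetical helper
  if (UR_RESULT_SUCCESS != result) {
    return result;
  }
  pDdiTable->pfnCreate = urPhysicalMemCreate;
  pDdiTable->pfnRetain = urPhysicalMemRetain;
  pDdiTable->pfnRelease = urPhysicalMemRelease;
  return UR_RESULT_SUCCESS;
}
```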

@pbalcer (Contributor) commented Nov 29, 2023

The GCC build on CUDA is now segfaulting: https://github.com/oneapi-src/unified-runtime/actions/runs/7033478414/job/19141529054?pr=939

If this is expected, you can add {{OPT}}Segmentation fault at the beginning of the match file, and then {{OPT}} next to all other lines. This is not a great solution, as such a match file will then not fail once a test is fixed, but it will suppress the problem.
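
For reference, a hypothetical cuda match file following that suggestion could look like the snippet below; the test names are illustrative only, not taken from this PR's actual match files.

```
{{OPT}}Segmentation fault
{{OPT}}urPhysicalMemCreateTest.Success/NVIDIA_CUDA_BACKEND___NVIDIA_GeForce_RTX_3060_Ti___4000
{{OPT}}urVirtualMemGranularityGetInfoTest.Success/NVIDIA_CUDA_BACKEND___NVIDIA_GeForce_RTX_3060_Ti___1
```

With every line marked {{OPT}}, the match check passes whether or not those failure lines appear, which is why this only suppresses the problem rather than tracking it.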

@kbenzie (Contributor) commented Nov 30, 2023

@steffenlarsen I can have a look into what's going on with the CUDA failures today.

@kbenzie (Contributor) commented Nov 30, 2023

For the first failing test, running in Debug mode gives more informative output. I also added some printing of the size to allocate, and the test is trying to allocate almost the entirety of my 3060 Ti's 8 GB of VRAM. This is unreasonable; the test needs to be changed to either allow allocation failure or not attempt such large allocations in the first place. I have not looked into the other tests yet, as this isn't the only failing one.

[ RUN      ] urPhysicalMemCreateTest.Success/NVIDIA_CUDA_BACKEND___NVIDIA_GeForce_RTX_3060_Ti___4000
size: 8388608000

UR CUDA ERROR:
        Value:           2
        Name:            CUDA_ERROR_OUT_OF_MEMORY
        Description:     out of memory
        Function:        urPhysicalMemCreate
        Source Location: /home/benie/Projects/oneapi-src/unified-runtime/source/adapters/cuda/physical_mem.cpp:29

/home/benie/Projects/oneapi-src/unified-runtime/test/conformance/virtual_memory/urPhysicalMemCreate.cpp:26: Failure
Expected equality of these values:
  uur::Result(UR_RESULT_SUCCESS)
    Which is: UR_RESULT_SUCCESS
  uur::Result(urPhysicalMemCreate(context, device, size, nullptr, &physical_mem))
    Which is: UR_RESULT_ERROR_UNKNOWN
[  FAILED  ] urPhysicalMemCreateTest.Success/NVIDIA_CUDA_BACKEND___NVIDIA_GeForce_RTX_3060_Ti___4000, where GetParam() = (NVIDIA_GeForce_RTX_3060_Ti_, 4000) (54 ms)
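
For what it's worth, the oversized request appears to come from the test parameter being scaled by the device's virtual memory granularity; the snippet below is a guess at the CTS logic rather than the actual code, but with a typical 2 MiB granularity it reproduces the 8388608000-byte request seen above for the 4000 parameter.

```cpp
#include <cstddef>
#include <cstdio>

int main() {
  // Assumed: the parameterized count is multiplied by the device granularity
  // so the requested physical size is always a valid multiple of it.
  const size_t granularity = 2 * 1024 * 1024; // 2 MiB, common on NVIDIA GPUs
  const size_t params[] = {1, 2, 3, 7, 12, 44, 1024, 4000, 12345};
  for (size_t count : params) {
    std::printf("param %zu -> %zu bytes\n", count, count * granularity);
  }
  return 0; // param 4000 -> 8388608000 bytes, nearly 8 GB of physical memory
}
```

Trimming the parameter list to small counts, as in the diff in the review comment below, keeps the requests well under typical VRAM sizes.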

@kbenzie (Contributor) left a review comment

There are three bugs in the CTS; since you've not touched those files, here's the diff to fix them. These fixes, in addition to the error code special case below, result in a 100% pass rate for the virtual memory test suite on the CUDA adapter on my 3060 Ti locally.

diff --git a/test/conformance/virtual_memory/urPhysicalMemCreate.cpp b/test/conformance/virtual_memory/urPhysicalMemCreate.cpp
index 078b0e68..e5124da1 100644
--- a/test/conformance/virtual_memory/urPhysicalMemCreate.cpp
+++ b/test/conformance/virtual_memory/urPhysicalMemCreate.cpp
@@ -16,8 +16,7 @@ struct urPhysicalMemCreateTest
     size_t size;
 };

-UUR_TEST_SUITE_P(urPhysicalMemCreateTest,
-                 ::testing::Values(1, 2, 3, 7, 12, 44, 1024, 4000, 12345),
+UUR_TEST_SUITE_P(urPhysicalMemCreateTest, ::testing::Values(1, 2, 3, 7, 12, 44),
                  uur::deviceTestWithParamPrinter<size_t>);

 TEST_P(urPhysicalMemCreateTest, Success) {
diff --git a/test/conformance/virtual_memory/urVirtualMemFree.cpp b/test/conformance/virtual_memory/urVirtualMemFree.cpp
index 6cb87955..3b4c4bab 100644
--- a/test/conformance/virtual_memory/urVirtualMemFree.cpp
+++ b/test/conformance/virtual_memory/urVirtualMemFree.cpp
@@ -19,5 +19,5 @@ TEST_P(urVirtualMemFreeTest, InvalidNullHandleContext) {

 TEST_P(urVirtualMemFreeTest, InvalidNullPointerStart) {
     ASSERT_EQ_RESULT(urVirtualMemFree(context, nullptr, size),
-                     UR_RESULT_ERROR_INVALID_NULL_HANDLE);
+                     UR_RESULT_ERROR_INVALID_NULL_POINTER);
 }
diff --git a/test/conformance/virtual_memory/urVirtualMemGranularityGetInfo.cpp b/test/conformance/virtual_memory/urVirtualMemGranularityGetInfo.cpp
index d4feccd6..df8c2879 100644
--- a/test/conformance/virtual_memory/urVirtualMemGranularityGetInfo.cpp
+++ b/test/conformance/virtual_memory/urVirtualMemGranularityGetInfo.cpp
@@ -60,7 +60,7 @@ TEST_P(urVirtualMemGranularityGetInfoNegativeTest, InvalidEnumeration) {
                          context, device,
                          UR_VIRTUAL_MEM_GRANULARITY_INFO_FORCE_UINT32, 0,
                          nullptr, &size),
-                     UR_RESULT_ERROR_INVALID_NULL_HANDLE);
+                     UR_RESULT_ERROR_INVALID_ENUMERATION);
 }

 TEST_P(urVirtualMemGranularityGetInfoNegativeTest,

Review comment on source/adapters/cuda/physical_mem.cpp (outdated, resolved)
Signed-off-by: Larsen, Steffen <steffen.larsen@intel.com>
@steffenlarsen (Contributor, Author)

Thank you, @kbenzie! The suggestions have been applied. Hopefully CI agrees. 🤞

@pbalcer (Contributor) commented Nov 30, 2023

You need to update the match files again ;) CUDA's should be empty; I don't know how the test changes will affect other adapters.

@codecov-commenter commented Dec 1, 2023

Codecov Report

Attention: 3 lines in your changes are missing coverage. Please review.

Comparison is base (dcec3fe) 15.73% compared to head (0563259) 15.74%.

Files Patch % Lines
...conformance/virtual_memory/urPhysicalMemCreate.cpp 0.00% 1 Missing ⚠️
...st/conformance/virtual_memory/urVirtualMemFree.cpp 0.00% 1 Missing ⚠️
.../virtual_memory/urVirtualMemGranularityGetInfo.cpp 0.00% 1 Missing ⚠️


Additional details and impacted files
@@           Coverage Diff           @@
##             main     #939   +/-   ##
=======================================
  Coverage   15.73%   15.74%           
=======================================
  Files         223      223           
  Lines       31465    31465           
  Branches     3556     3556           
=======================================
+ Hits         4952     4953    +1     
  Misses      26462    26462           
+ Partials       51       50    -1     


Signed-off-by: Larsen, Steffen <steffen.larsen@intel.com>
Signed-off-by: Larsen, Steffen <steffen.larsen@intel.com>
@kbenzie kbenzie added the "ready to merge" label Dec 4, 2023
@fabiomestre fabiomestre changed the base branch from adapters to main December 5, 2023 16:54
@fabiomestre (Contributor)

I have updated the target branch of this PR from the adapters branch to the main branch.
Development in UR is moving back to main. The adapters branch will soon be deleted.

@cdai2 commented Dec 5, 2023

Could you please help merge this PR? It has the "ready to merge" flag now.
We need to test and submit more PRs based on this PR.

@kbenzie (Contributor) commented Dec 5, 2023

> Could you please help merge this PR? It has the "ready to merge" flag now.
> We need to test and submit more PRs based on this PR.

Apologies for the slow rate of merging. However, we do have a rather large backlog of PRs which are also gated on merging to intel/llvm and which are bug fixes for the next release. We'll do our best to move this along as quickly as we can.

@kbenzie kbenzie removed the "ready to merge" label Dec 14, 2023
@kbenzie (Contributor) commented Dec 14, 2023

Marking as draft until intel/llvm#8954 passes.

@kbenzie kbenzie marked this pull request as draft December 14, 2023 17:13
@aarongreig aarongreig marked this pull request as ready for review December 18, 2023 10:06
@aarongreig (Contributor)

Even though these changes shouldn't affect the adapter behaviour for SYCL without all the associated PI etc. changes in intel/llvm#8954, I've made intel/llvm#12198 just so we have a commit on the intel/llvm side to track the adapter changes back to if needed. I'll merge this PR once the testing there passes (should be soon, just waiting for the final job to get picked up by a runner).

@aarongreig aarongreig merged commit 8d1486a into oneapi-src:main Dec 18, 2023
51 checks passed