[bugfix, enhancement] enable proper GPU offloading with fp64 support when DPCtl unavailable #2152
/intelci: run
.def(py::init([](const py::int_& obj) {
    return get_queue_by_pylong_pointer(obj);
})
Could you please share the case where this is needed? Is it covered by tests?
https://github.com/pytorch/pytorch/blob/main/torch/csrc/xpu/Stream.cpp#L72, waiting on a compiler compatibility fix here: pytorch/pytorch#139775
Please add a comment somewhere that this is needed to accept pytorch tensors.
I think the better place for the comment is along with the function's definition. But it's up to you to decide about the comment's placement.
I've now left a note. @Vika-F, please let me know what you think!
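For readers following this thread, here is a minimal sketch of the general technique behind a constructor that takes a Python integer: an object's address is passed around as an integer and turned back into a typed object on the C++ side. Everything named here is an assumption for illustration. `FakeQueue` stands in for `sycl::queue`, and `address_of` and `queue_from_address` are hypothetical helpers. The body of the real `get_queue_by_pylong_pointer` is not shown in this conversation and may differ.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>

// Hypothetical stand-in for sycl::queue; NOT the real SYCL type.
struct FakeQueue {
    int device_id;
};

// Export an object's address as an integer, the way a Python int could carry it.
std::uintptr_t address_of(FakeQueue& q) {
    return reinterpret_cast<std::uintptr_t>(&q);
}

// Recover the object from an integer address and return a copy of it.
// The caller must guarantee that the address still refers to a live FakeQueue.
FakeQueue queue_from_address(std::uintptr_t addr) {
    if (addr == 0) {
        throw std::invalid_argument("null queue address");
    }
    return *reinterpret_cast<FakeQueue*>(addr);
}
```

The null check is the only validation an integer address allows; nothing in the integer itself proves that it points at a queue, which is one reason a comment next to the function's definition is useful.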
@@ -102,6 +106,7 @@ namespace oneapi::dal::python {
#else
#ifdef ONEDAL_DATA_PARALLEL
PYBIND11_MODULE(_onedal_py_dpc, m) {
    init_sycl(m);
Shouldn't this be sycl_interfaces?
@samir-nasibli, would I need to rename sycl.cpp to sycl_interfaces.cpp? If so, do you want me to move the contents of sycl.cpp to sycl_interfaces.cpp?
Intel Max GPUs support fp64 computation, but the dummy SYCL queue assumes that no GPU can compute in double precision. The dummy SYCL queue is used when DPCtl is not installed, which is the case when testing sklearn conformance on GPU. The resulting down-conversion from double to float makes results less precise and is behind many of the deselected GPU tests. I think there will be a follow-up PR that differentiates GPU deselections based on hardware characteristics; for now I will leave all deselections in place.
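As a rough illustration of the point above, the sketch below contrasts a blanket "every GPU lacks fp64" rule with a per-device capability check, and shows the precision lost in a double-to-float round trip. `DeviceInfo`, `needs_downcast_old`, `needs_downcast`, and `through_float` are hypothetical names, not code from this PR. In real SYCL code the capability would typically come from `sycl::device::has(sycl::aspect::fp64)` rather than a plain flag.

```cpp
#include <cassert>

// Hypothetical device descriptor; a real implementation would query the SYCL device.
struct DeviceInfo {
    bool is_gpu;
    bool supports_fp64;
};

// Blanket rule described above: every GPU is treated as lacking fp64.
bool needs_downcast_old(const DeviceInfo& d) {
    return d.is_gpu;
}

// Capability-based rule: downcast only when the device really lacks fp64.
bool needs_downcast(const DeviceInfo& d) {
    return !d.supports_fp64;
}

// Round-trip a double through float to mimic the down-conversion.
double through_float(double x) {
    return static_cast<double>(static_cast<float>(x));
}
```

With a descriptor such as `DeviceInfo{true, true}` for an fp64-capable GPU, the blanket rule forces a downcast while the capability-based rule does not. Values like 0.1 change after `through_float`, whereas values exactly representable in float, like 0.5, survive unchanged.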
Co-authored-by: Andreas Huber <9201869+ahuber21@users.noreply.github.com>
#ifdef ONEDAL_DATA_PARALLEL

void instantiate_sycl_interfaces(py::module& m){
    py::class_<sycl::queue> syclqueue(m, "SyclQueue");
Please write a comment about the purpose of this class, i.e. that it implements the SYCL queue interface in case dpctl's SYCL queue is not available.
Done, let me know what you think
NOTE: now that #2160 is nearly merged, I would like that PR to be merged into main first and this PR rebased afterwards, so that full testing is done.
…t-learn-intelex into dev/dummysyclqueue
/intelci: run
It seems that the previously implemented get_device_id isn't conformant with DPCtl, so the filter_string checks used for comparisons will be commented out in this PR. A follow-up ticket will be made to correct this, ideally when we do a DLPack conformance rollout.
/intelci: run
Description
This corrects a circular import issue with onedal.utils.validation. When trying to create interfaces to the onedal backend in that file, _device_offload imports from validation, so using the policy in the validation file creates an import loop. Moving the check to C++ creates a better interface and removes the need for the _device_offload import in _policy.py entirely.
PR should start as a draft, then move to ready for review state after CI is passed and all applicable checkboxes are closed.
This approach ensures that reviewers don't spend extra time asking for regular requirements.
You can remove a checkbox as not applicable only if it doesn't relate to this PR in any way.
For example, PR with docs update doesn't require checkboxes for performance while PR with any change in actual code should have checkboxes and justify how this code change is expected to affect performance (or justification should be self-evident).
Checklist to comply with before moving PR from draft:
PR completeness and readability
Testing
Performance