Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[bugfix, enhancement] enable proper GPU offloading with fp64 support when DPCtl unavailable #2152

Open
wants to merge 73 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from 56 commits
Commits
Show all changes
73 commits
Select commit Hold shift + click to select a range
a1c9df5
carryover from #2126
icfaust Nov 5, 2024
7a18a1d
Merge branch 'intel:main' into dev/dummysyclqueue
icfaust Nov 7, 2024
92f8c03
looking ahead
icfaust Nov 7, 2024
8b08af0
attempt to get it to compile
icfaust Nov 7, 2024
0941eb2
forgotten :
icfaust Nov 7, 2024
3aaa59c
attempts at fixes
icfaust Nov 7, 2024
ca25fdc
maybe?
icfaust Nov 7, 2024
36fc5a2
modify properties
icfaust Nov 7, 2024
f20252f
modify properties
icfaust Nov 7, 2024
e5284db
maybe?
icfaust Nov 7, 2024
085c170
another change
icfaust Nov 7, 2024
d47b05f
cleanup
icfaust Nov 7, 2024
56ec2a6
Update policy.cpp
icfaust Nov 7, 2024
350c5d7
add necessary features
icfaust Nov 8, 2024
0c03531
attempt to fix compiling issues
icfaust Nov 8, 2024
58202f4
attempt to fix compiling issues
icfaust Nov 8, 2024
1ab22fd
try again
icfaust Nov 8, 2024
70dd06f
try to deal with sycl queue pointers
icfaust Nov 8, 2024
f18abc9
missing value
icfaust Nov 8, 2024
c21af6d
try again
icfaust Nov 8, 2024
4278de3
extract DummySyclQueue
icfaust Nov 8, 2024
e4129d5
temporary solution to this issue...
icfaust Nov 8, 2024
2f4c476
changes just to test operation
icfaust Nov 8, 2024
bb04233
Update _device_offload.py
icfaust Nov 8, 2024
02ae4ed
move to a central sycl storage
icfaust Nov 8, 2024
fd35cc9
last change
icfaust Nov 8, 2024
6f5e4a2
fixes
icfaust Nov 8, 2024
1613b07
add end
icfaust Nov 8, 2024
381114b
move to beginning
icfaust Nov 8, 2024
7b56d73
readd alias
icfaust Nov 8, 2024
eace58c
missing definition
icfaust Nov 8, 2024
8d3cbc6
missing namespace
icfaust Nov 8, 2024
db8e5b6
whitespace
icfaust Nov 8, 2024
5e7946d
forgotten header change
icfaust Nov 8, 2024
4cdae49
oops
icfaust Nov 8, 2024
45482f3
spmd fixes
icfaust Nov 8, 2024
034c5c8
recognized problem
icfaust Nov 8, 2024
675eea7
recognized problem
icfaust Nov 8, 2024
5193b9a
add new header
icfaust Nov 8, 2024
cacd8d7
add whitespace
icfaust Nov 8, 2024
3ddbf5b
remove unnecessary include
icfaust Nov 8, 2024
fadf40d
remove whitespace
icfaust Nov 8, 2024
b182a14
refactor again
icfaust Nov 8, 2024
565ccef
maybe this is better
icfaust Nov 9, 2024
d993182
oops
icfaust Nov 9, 2024
92f507e
oops
icfaust Nov 9, 2024
6f17e92
add native constructors
icfaust Nov 9, 2024
77e020a
fix mistake
icfaust Nov 9, 2024
e72d916
sigh matching implementation
icfaust Nov 9, 2024
51126d7
removing unnecessary lambda
icfaust Nov 9, 2024
368895a
removing unnecessary lambda
icfaust Nov 9, 2024
a14206b
Update deselected_tests.yaml
icfaust Nov 9, 2024
380d2d3
revert change to spmd
icfaust Nov 10, 2024
660aaf1
Update sycl_interfaces.cpp
icfaust Nov 12, 2024
7bdb533
Update deselected_tests.yaml
icfaust Nov 12, 2024
e59b4c5
Update onedal/_device_offload.py
icfaust Nov 12, 2024
f3c6a80
Update _device_offload.py
icfaust Nov 13, 2024
781271b
add tests and fix errors observed in CI
icfaust Nov 14, 2024
589c5d9
merge upstream
icfaust Nov 14, 2024
a55a56f
add missing file
icfaust Nov 14, 2024
4cb1564
fix error in _is_dpc_backend
icfaust Nov 14, 2024
1f287fe
fix coding issues
icfaust Nov 14, 2024
b4d137c
fix one of many mistakes
icfaust Nov 14, 2024
563612f
Merge branch 'intel:main' into dev/dummysyclqueue
icfaust Nov 14, 2024
6710772
switch to filter_string
icfaust Nov 14, 2024
f2161cb
Merge branch 'dev/dummysyclqueue' of https://github.com/icfaust/sciki…
icfaust Nov 14, 2024
91522f4
Update test_sycl.py
icfaust Nov 14, 2024
e71e7c0
Update test_sycl.py
icfaust Nov 14, 2024
89e369b
Update test_sycl.py
icfaust Nov 14, 2024
e237c2c
formatting
icfaust Nov 15, 2024
d43ca05
add requested comments
icfaust Nov 15, 2024
b083ff3
remove filter_string checks and add comment
icfaust Nov 15, 2024
4e3e82d
remove unneccessary comment out
icfaust Nov 15, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
34 changes: 6 additions & 28 deletions onedal/_device_offload.py
Original file line number Diff line number Diff line change
Expand Up @@ -29,37 +29,17 @@
from dpctl import SyclQueue
from dpctl.memory import MemoryUSMDevice, as_usm_memory
from dpctl.tensor import usm_ndarray
else:
import onedal

SyclQueue = getattr(onedal._backend, "SyclQueue", None)

if dpnp_available:
import dpnp

from .utils._array_api import _convert_to_dpnp


class DummySyclQueue:
"""This class is designed to act like dpctl.SyclQueue
to allow device dispatching in scenarios when dpctl is not available"""

class DummySyclDevice:
def __init__(self, filter_string):
self._filter_string = filter_string
self.is_cpu = "cpu" in filter_string
self.is_gpu = "gpu" in filter_string
self.has_aspect_fp64 = self.is_cpu

if not (self.is_cpu):
logging.warning(
"Device support is limited. "
"Please install dpctl for full experience"
)

def get_filter_string(self):
return self._filter_string

def __init__(self, filter_string):
self.sycl_device = self.DummySyclDevice(filter_string)


def _copy_to_usm(queue, array):
if not dpctl_available:
raise RuntimeError(
Expand Down Expand Up @@ -139,12 +119,10 @@ def _transfer_to_host(queue, *data):
def _get_global_queue():
target = _get_config()["target_offload"]

QueueClass = DummySyclQueue if not dpctl_available else SyclQueue

if target != "auto":
if isinstance(target, QueueClass):
if isinstance(target, SyclQueue):
return target
return QueueClass(target)
return SyclQueue(target)
return None


Expand Down
4 changes: 0 additions & 4 deletions onedal/common/_policy.py
Original file line number Diff line number Diff line change
Expand Up @@ -48,12 +48,8 @@ def __init__(self):


if _is_dpc_backend:
from onedal._device_offload import DummySyclQueue

class _DataParallelInteropPolicy(_backend.data_parallel_policy):
def __init__(self, queue):
self._queue = queue
if isinstance(queue, DummySyclQueue):
super().__init__(self._queue.sycl_device.get_filter_string())
return
super().__init__(self._queue)
66 changes: 0 additions & 66 deletions onedal/common/device_lookup.cpp

This file was deleted.

31 changes: 0 additions & 31 deletions onedal/common/device_lookup.hpp

This file was deleted.

28 changes: 22 additions & 6 deletions onedal/common/policy.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -15,7 +15,7 @@
*******************************************************************************/

#include "oneapi/dal/detail/policy.hpp"
#include "onedal/common/policy_common.hpp"
#include "onedal/common/policy.hpp"
#include "onedal/common/pybind11_helpers.hpp"

namespace py = pybind11;
Expand All @@ -41,12 +41,28 @@ void instantiate_default_host_policy(py::module& m) {

#ifdef ONEDAL_DATA_PARALLEL

using data_parallel_policy_t = dal::detail::data_parallel_policy;
using dp_policy_t = dal::detail::data_parallel_policy;

dp_policy_t make_dp_policy(std::uint32_t id) {
sycl::queue queue = get_queue_by_device_id(id);
return dp_policy_t{ std::move(queue) };
}

dp_policy_t make_dp_policy(const py::object& syclobj) {
sycl::queue queue = get_queue_from_python(syclobj);
return dp_policy_t{ std::move(queue) };
}

dp_policy_t make_dp_policy(const std::string& filter) {
sycl::queue queue = get_queue_by_filter_string(filter);
return dp_policy_t{ std::move(queue) };
}

void instantiate_data_parallel_policy(py::module& m) {
constexpr const char name[] = "data_parallel_policy";
py::class_<data_parallel_policy_t> policy(m, name);
policy.def(py::init<data_parallel_policy_t>());
py::class_<dp_policy_t> policy(m, name);
policy.def(py::init<dp_policy_t>());
policy.def(py::init<const sycl::queue&>());
policy.def(py::init([](std::uint32_t id) {
return make_dp_policy(id);
}));
Expand All @@ -56,10 +72,10 @@ void instantiate_data_parallel_policy(py::module& m) {
policy.def(py::init([](const py::object& syclobj) {
return make_dp_policy(syclobj);
}));
policy.def("get_device_id", [](const data_parallel_policy_t& policy) {
policy.def("get_device_id", [](const dp_policy_t& policy) {
return get_device_id(policy);
});
policy.def("get_device_name", [](const data_parallel_policy_t& policy) {
policy.def("get_device_name", [](const dp_policy_t& policy) {
return get_device_name(policy);
});
m.def("get_used_memory", &get_used_memory, py::return_value_policy::take_ownership);
Expand Down
62 changes: 62 additions & 0 deletions onedal/common/policy.hpp
Original file line number Diff line number Diff line change
@@ -0,0 +1,62 @@
/*******************************************************************************
* Copyright 2024 Intel Corporation
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*******************************************************************************/

#pragma once

#include <string>
#include <cstdint>

#ifdef ONEDAL_DATA_PARALLEL
#include <sycl/sycl.hpp>
#endif // ONEDAL_DATA_PARALLEL

#include <pybind11/pybind11.h>

#include "oneapi/dal/detail/policy.hpp"

#include "onedal/common/sycl_interfaces.hpp"

namespace py = pybind11;

namespace oneapi::dal::python {

#ifdef ONEDAL_DATA_PARALLEL

using dp_policy_t = detail::data_parallel_policy;

dp_policy_t make_dp_policy(std::uint32_t id);
dp_policy_t make_dp_policy(const py::object& syclobj);
dp_policy_t make_dp_policy(const std::string& filter);
inline dp_policy_t make_dp_policy(const dp_policy_t& policy) {
return dp_policy_t{ policy };
}

#endif // ONEDAL_DATA_PARALLEL

template <typename Policy>
inline auto& instantiate_host_policy(py::class_<Policy>& policy) {
policy.def(py::init<>());
policy.def(py::init<Policy>());
policy.def("get_device_id", [](const Policy&) -> std::uint32_t {
return std::uint32_t{ 0u };
});
policy.def("get_device_name", [](const Policy&) -> std::string {
return std::string{ "cpu" };
});
return policy;
}

} // namespace oneapi::dal::python
2 changes: 1 addition & 1 deletion onedal/common/spmd_policy.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -19,7 +19,7 @@
#include "oneapi/dal/detail/spmd_policy.hpp"
#include "oneapi/dal/spmd/mpi/communicator.hpp"

#include "onedal/common/policy_common.hpp"
#include "onedal/common/policy.hpp"
#include "onedal/common/pybind11_helpers.hpp"

namespace py = pybind11;
Expand Down
77 changes: 77 additions & 0 deletions onedal/common/sycl.cpp
Original file line number Diff line number Diff line change
@@ -0,0 +1,77 @@
/*******************************************************************************
* Copyright 2024 Intel Corporation
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*******************************************************************************/

#include "onedal/common/sycl_interfaces.hpp"
#include "onedal/common/pybind11_helpers.hpp"

namespace py = pybind11;

namespace oneapi::dal::python {

#ifdef ONEDAL_DATA_PARALLEL

void instantiate_sycl_interfaces(py::module& m){
py::class_<sycl::queue> syclqueue(m, "SyclQueue");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please write a comment about the purpose of this class. I.e. that it implements sycl queue interface in case dpctl's sycl queue is not available.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, let me know what you think

syclqueue.def(py::init<const sycl::device&>())
.def(py::init([](const std::string& filter) {
return get_queue_by_filter_string(filter);
})
)
.def(py::init([](const py::int_& obj) {
return get_queue_by_pylong_pointer(obj);
})
Comment on lines +37 to +39
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please share the case when it is needed? Does it covered by tests?

Copy link
Contributor Author

@icfaust icfaust Nov 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a comment somewhere that this is needed to accept pytorch tensors.
I think the better place for the comment is along with the function's definition. But it's up to you to decide about the comment's placement.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've now left a note @Vika-F please let me know what you think!

)
.def(py::init([](const py::object& syclobj) {
return get_queue_from_python(syclobj);
})
)
.def("_get_capsule",[](const sycl::queue& queue) {
return pack_queue(std::make_shared<sycl::queue>(queue));
}
)
.def_property_readonly("sycl_device", &sycl::queue::get_device);

// expose limited sycl device features to python for oneDAL analysis
py::class_<sycl::device> sycldevice(m, "SyclDevice");
sycldevice.def(py::init([](std::uint32_t id) {
return get_device_by_id(id).value();
})
)
.def_property_readonly("has_aspect_fp64",[](const sycl::device& device) {
return device.has(sycl::aspect::fp64);
}
)
.def_property_readonly("has_aspect_fp16",[](const sycl::device& device) {
return device.has(sycl::aspect::fp16);
}
)
.def_property_readonly("filter_string",[](const sycl::device& device) {
// assumes we are not working with accelerators
std::string filter = get_device_name(device);
py::int_ id(get_device_id(device).value());
return py::str(filter + ":") + py::str(id);
}
)
.def_property_readonly("is_cpu", &sycl::device::is_cpu)
.def_property_readonly("is_gpu", &sycl::device::is_gpu);
}

ONEDAL_PY_INIT_MODULE(sycl) {
instantiate_sycl_interfaces(m);
}
#endif

} // namespace oneapi::dal::python
Loading
Loading