Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add generic device; Initial support in portBLAS #552

Closed
wants to merge 2 commits into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
8 changes: 8 additions & 0 deletions CMakeLists.txt
Original file line number Diff line number Diff line change
Expand Up @@ -60,6 +60,9 @@ option(ENABLE_CUFFT_BACKEND "Enable the cuFFT backend for the DFT interface" OFF
option(ENABLE_ROCFFT_BACKEND "Enable the rocFFT backend for the DFT interface" OFF)
option(ENABLE_PORTFFT_BACKEND "Enable the portFFT DFT backend for the DFT interface. Cannot be used with other DFT backends." OFF)

# Generic devices
option(ENABLE_GENERIC_DEVICE "Enable generic devices. Requires the portBLAS backend." OFF)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not keen on adding yet another user option. Could this be automatically enabled whenever portBLAS is enabled and the default tuning target is used?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could.

Generic device really means an unsupported device. That means:

  • It may not be tested, so there is a change of getting the wrong result.
  • Errors are likely to come from somewhere other than oneMKL Interfaces.

Consequently, I think any of this functionality should explicitly be opt-in. Additionally, we currently attempt to auto-detect a tuning target if it isn't set. This behavour is probably useful, although admittedtly portBLAS is even more useful for targets that don't have a vendor library already part of oneMKL Interfaces.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

admittedtly portBLAS is even more useful for targets that don't have a vendor library already part of oneMKL Interfaces.

Yes that was my main reason to avoid having this option.

Thinking some more about it, if portBLAS always uses the "generic device" even for supported devices then it wouldn't conflict with the other backends anymore. It would make it easier to have portBLAS used as a fallback. The fallback feature won't happen in this PR but I think we will likely want to remove this option in the future so might as well not add it.
I think we could have portBLAS always use the generic device as part of this PR as well.


set(ONEMKL_SYCL_IMPLEMENTATION "dpc++" CACHE STRING "Name of the SYCL compiler")
set(HIP_TARGETS "" CACHE STRING "Target HIP architectures")

Expand Down Expand Up @@ -123,6 +126,11 @@ if (ENABLE_PORTFFT_BACKEND
message(FATAL_ERROR "ENABLE_PORTFFT_BACKEND cannot be enabled at the same time as other DFT backends.")
endif()

if(ENABLE_GENERIC_DEVICE
AND NOT ENABLE_PORTBLAS_BACKEND)
message(FATAL_ERROR "ENABLE_GENERIC_DEVICE requires that the portBLAS backend is enabled.")
endif()

# Define required CXX compilers before project
if(CMAKE_CXX_COMPILER OR NOT ONEMKL_SYCL_IMPLEMENTATION STREQUAL "dpc++")
if(WIN32)
Expand Down
11 changes: 9 additions & 2 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -62,7 +62,7 @@ oneMKL is part of the [UXL Foundation](http://www.uxlfoundation.org).
</tr>
<tr>
<td align="center"><a href="https://github.com/codeplaysoftware/portBLAS"> portBLAS </a></td>
<td align="center">x86 CPU, Intel GPU, NVIDIA GPU, AMD GPU</td>
<td align="center">x86 CPU, Intel GPU, NVIDIA GPU, AMD GPU, Other SYCL devices (unsupported)</td>
</tr>
<tr>
<td align="center"><a href="https://github.com/codeplaysoftware/portFFT"> portFFT </a></td>
Expand Down Expand Up @@ -172,7 +172,7 @@ Supported compilers include:
</thead>
<tbody>
<tr>
<td rowspan=9 align="center">BLAS</td>
<td rowspan=10 align="center">BLAS</td>
<td rowspan=3 align="center">x86 CPU</td>
<td align="center">Intel(R) oneMKL</td>
<td align="center">Intel DPC++</br>AdaptiveCpp</td>
Expand Down Expand Up @@ -221,6 +221,12 @@ Supported compilers include:
<td align="center">Open DPC++</td>
<td align="center">Dynamic, Static</td>
</tr>
<tr>
<td rowspan=1 align="center">Other SYCL devices (unsupported)</td>
<td align="center">portBLAS</td>
<td align="center">Intel DPC++</br>Open DPC++</td>
<td align="center">Dynamic, Static</td>
</tr>
<tr>
<td rowspan=4 align="center">LAPACK</td>
<td align="center">x86 CPU</td>
Expand Down Expand Up @@ -405,6 +411,7 @@ Supported compilers include:
- Intel(R) Data Center GPU Max Series
- NVIDIA(R) A100 (Linux* only)
- AMD(R) GPUs see [here](https://github.com/RadeonOpenCompute/ROCm#hardware-and-software-support) tested on AMD Vega 20 (gfx906)
- Other SYCL devices can be used, but are not supported

---
### Supported Operating Systems
Expand Down
36 changes: 36 additions & 0 deletions docs/building_the_project_with_dpcpp.rst
Original file line number Diff line number Diff line change
Expand Up @@ -128,6 +128,9 @@ The most important supported build options are:
* - ENABLE_PORTFFT_BACKEND
- True, False
- False
* - ENABLE_GENERIC_DEVICE
- True, False
- False
* - BUILD_FUNCTIONAL_TESTS
- True, False
- True
Expand Down Expand Up @@ -225,6 +228,23 @@ A few often-used architectures are listed below:
For a host with ROCm installed, the device architecture can be retrieved via the
``rocminfo`` tool. The architecture will be displayed in the ``Name:`` row.

.. _build_for_other_SYCL_devices:

Building for other SYCL devices
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

SYCL enables portable heterogeneous computing on a wide range of accelerators.
Consequently, it is possible to use oneMKL Interfaces with accelerators not
anticipated by the oneMKL Interfaces team. This can be enabled using the
``-DENABLE_GENERIC_DEVICE=ON`` option. However, this is not a supported
configuration.

For generic SYCL devices, only the portBLAS backend is enabled. The user must
set the appropriate ``-fsycl-targets`` for their device, and also any
``PORTBLAS_TUNING_TARGET`` required for performance. See
`Building for portBLAS`_. Extensive testing is strongly advised for these
unsupported configurations.

.. _build_for_portlibs_dpcpp:

Pure SYCL backends: portBLAS and portFFT
Expand Down Expand Up @@ -408,6 +428,22 @@ set, the backend libraries to enable the use of BLAS, LAPACK and RNG with MKLGPU
and MKLCPU would also be enabled. The build of examples is disabled. Since
functional testing was not disabled, tests would be built.

Build oneMKL for the BLAS domain on a generic SYCL device:

.. code-block:: bash
cmake $ONEMKL_DIR \
-DCMAKE_CXX_COMPILER=clang++ \
-DCMAKE_C_COMPILER=clang \
-DENABLE_MKLCPU_BACKEND=False \
-DENABLE_MKLGPU_BACKEND=False \
-DENABLE_PORTBLAS_BACKEND=True \
-DENABLE_GENERIC_DEVICE=True
Note that this is not a supported configuration. This builds oneMKL Interfaces
with the portBLAS backend only, for a generic SYCL device supported by the
Open DPC++ project.

.. _project_cleanup:

Project Cleanup
Expand Down
8 changes: 7 additions & 1 deletion include/oneapi/mkl/detail/backends_table.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -40,7 +40,7 @@
namespace oneapi {
namespace mkl {

enum class device : uint16_t { x86cpu, intelgpu, nvidiagpu, amdgpu };
enum class device : uint16_t { x86cpu, intelgpu, nvidiagpu, amdgpu, generic_device };
enum class domain : uint16_t { blas, dft, lapack, rng, sparse_blas };

static std::map<domain, std::map<device, std::vector<const char*>>> libraries = {
Expand Down Expand Up @@ -82,6 +82,12 @@ static std::map<domain, std::map<device, std::vector<const char*>>> libraries =
#endif
#ifdef ENABLE_PORTBLAS_BACKEND_NVIDIA_GPU
LIB_NAME("blas_portblas"),
#endif
} },
{ device::generic_device,
{
#ifdef ENABLE_PORTBLAS_BACKEND_GENERIC_DEVICE
LIB_NAME("blas_portblas"),
#endif
} } } },

Expand Down
8 changes: 8 additions & 0 deletions include/oneapi/mkl/detail/get_device_id.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -59,11 +59,19 @@ inline oneapi::mkl::device get_device_id(sycl::queue &queue) {
else if (vendor_id == AMD_ID)
device_id = device::amdgpu;
else {
#ifdef ENABLE_GENERIC_DEVICE
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems to me we could avoid these ifdef and always return a generic_device as a fallback. The table_initializer object should throw backend_not_found the generic_device is used with a backend that does not support it.

device_id = device::generic_device;
#else
throw unsupported_device("", "", queue.get_device());
#endif // ENABLE_GENERIC_DEVICE
}
}
else {
#ifdef ENABLE_GENERIC_DEVICE
device_id = device::generic_device;
#else
throw unsupported_device("", "", queue.get_device());
#endif // ENABLE_GENERIC_DEVICE
}
return device_id;
}
Expand Down
1 change: 1 addition & 0 deletions src/CMakeLists.txt
Original file line number Diff line number Diff line change
Expand Up @@ -29,6 +29,7 @@ set(ENABLE_PORTBLAS_BACKEND_INTEL_CPU OFF CACHE INTERNAL "")
set(ENABLE_PORTBLAS_BACKEND_INTEL_GPU OFF CACHE INTERNAL "")
set(ENABLE_PORTBLAS_BACKEND_AMD_GPU OFF CACHE INTERNAL "")
set(ENABLE_PORTBLAS_BACKEND_NVIDIA_GPU OFF CACHE INTERNAL "")
set(ENABLE_PORTBLAS_BACKEND_GENERIC_DEVICE OFF CACHE INTERNAL "")
# store path to CMAKE_CURRENT_BINARY_DIR to use it later (makes FetchContent_Declare workable)
set(ONEMKL_GENERATED_INCLUDE_PATH ${CMAKE_CURRENT_BINARY_DIR})

Expand Down
10 changes: 9 additions & 1 deletion src/blas/backends/portblas/CMakeLists.txt
Original file line number Diff line number Diff line change
Expand Up @@ -46,7 +46,15 @@ if(NUM_TARGETS EQUAL 0)
list(LENGTH SYCL_TARGETS NUM_TARGETS)
endif()

if(PORTBLAS_TUNING_TARGET)
if(ENABLE_GENERIC_DEVICE)
set(ENABLE_PORTBLAS_BACKEND_GENERIC_DEVICE "ON" CACHE INTERNAL "")
target_compile_options(ONEMKL::SYCL::SYCL INTERFACE -fno-sycl-instrument-device-code)
if(NOT PORTBLAS_TUNING_TARGET)
# If a generic device is specified, set the tuning target to default for best compatibility.
message(STATUS "Setting DEFAULT portBLAS tuning target for generic device.")
set(PORTBLAS_TUNING_TARGET "DEFAULT")
endif()
elseif (PORTBLAS_TUNING_TARGET)
# Allow the user to manually enable a specific device type
# for tuned portBLAS configurations and sets sycl-target.
if(PORTBLAS_TUNING_TARGET STREQUAL "INTEL_CPU")
Expand Down
2 changes: 2 additions & 0 deletions src/config.hpp.in
Original file line number Diff line number Diff line change
Expand Up @@ -32,12 +32,14 @@
#cmakedefine ENABLE_PORTBLAS_BACKEND_INTEL_CPU
#cmakedefine ENABLE_PORTBLAS_BACKEND_INTEL_GPU
#cmakedefine ENABLE_PORTBLAS_BACKEND_NVIDIA_GPU
#cmakedefine ENABLE_PORTBLAS_BACKEND_GENERIC_DEVICE
#cmakedefine ENABLE_PORTFFT_BACKEND
#cmakedefine ENABLE_ROCBLAS_BACKEND
#cmakedefine ENABLE_ROCFFT_BACKEND
#cmakedefine ENABLE_ROCRAND_BACKEND
#cmakedefine ENABLE_ROCSOLVER_BACKEND
#cmakedefine BUILD_SHARED_LIBS
#cmakedefine ENABLE_GENERIC_DEVICE
#cmakedefine REF_BLAS_LIBNAME "@REF_BLAS_LIBNAME@"
#cmakedefine REF_CBLAS_LIBNAME "@REF_CBLAS_LIBNAME@"

Expand Down
Loading