Data-parallel Extension for Numba* (numba-dpex) is a standalone extension for
the Numba Python JIT compiler. Numba-dpex provides
a generic kernel programming API and an offload feature that extends Numba's
auto-parallelizer to generate data-parallel kernels for parfor
nodes.
Numba-dpex's kernel API is similar in design to Numba's cuda.jit
module, but is based on the SYCL language. Code generation
for the kernel API currently supports
SPIR-V-based
OpenCL and
oneAPI Level Zero
devices that are supported by the Intel® DPC++ SYCL compiler runtime. Supported
devices include Intel® CPUs, integrated GPUs, and discrete GPUs.
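To make the kernel programming model concrete, the sketch below emulates a data-parallel launch in plain Python. It is not the numba-dpex API: `vecadd` and `launch` are hypothetical names used only for illustration, and the work items run sequentially here, whereas a real device would execute them in parallel.

```python
# Plain-Python emulation of a data-parallel kernel launch.
# NOTE: `vecadd` and `launch` are hypothetical names for illustration;
# they are not part of numba-dpex.


def vecadd(i, a, b, c):
    """Kernel body: one work item handles exactly one index i."""
    c[i] = a[i] + b[i]


def launch(kernel, global_size, *args):
    """Call the kernel once per work item in the range [0, global_size).

    A real device would run these invocations in parallel; this
    emulation runs them one after another.
    """
    for i in range(global_size):
        kernel(i, *args)


a = [1.0, 2.0, 3.0, 4.0]
b = [10.0, 20.0, 30.0, 40.0]
c = [0.0] * len(a)

launch(vecadd, len(a), a, b, c)
print(c)  # [11.0, 22.0, 33.0, 44.0]
```

The point of the sketch is the shape of the code: the kernel describes the work for a single index, and the launch decides how many indices exist. Refer to the numba-dpex documentation for the actual decorators and launch syntax.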
The offload functionality in numba-dpex is based on Numba's parfor
loop-parallelizer. Our compiler extends Numba's parfor
feature to generate
kernels and offload them to devices supported by the DPC++ SYCL compiler runtime.
The offload functionality is supported via a new NumPy drop-in replacement
library, dpnp. NumPy-based expressions
and numba.prange
loops are not offloaded.
Refer to the documentation and examples to learn more.
Numba-dpex is part of the Intel® Distribution of Python (IDP) and Intel® oneAPI AIKit, and can be installed along with oneAPI. It can also be installed from Anaconda Cloud. Please refer to the instructions on our documentation page for more details.
Once the package is installed, a good starting point is to run the examples in
the numba_dpex/examples
directory. The test suite may also be invoked as
follows:
python -m pytest --pyargs numba_dpex.tests
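If you prefer to launch the test suite from a script, the following is a minimal sketch that first checks whether the package is importable and then runs the same pytest command in a subprocess. The helper names (`is_installed`, `build_pytest_command`, `run_tests`) are hypothetical and not part of numba-dpex.

```python
import importlib.util
import subprocess
import sys

# NOTE: the helper names below are hypothetical, for illustration only.


def is_installed(package="numba_dpex"):
    """Return True if the top-level package can be found for import."""
    return importlib.util.find_spec(package) is not None


def build_pytest_command(package="numba_dpex"):
    """Build the equivalent of: python -m pytest --pyargs <package>.tests"""
    return [sys.executable, "-m", "pytest", "--pyargs", f"{package}.tests"]


def run_tests(package="numba_dpex"):
    """Run the package's test suite; return pytest's exit code.

    Returns None without running anything if the package is missing.
    """
    if not is_installed(package):
        print(f"{package} is not installed; skipping the test run.")
        return None
    return subprocess.run(build_pytest_command(package)).returncode
```

Calling `run_tests()` runs the suite with the current interpreter, which avoids accidentally picking up a different Python environment from the shell's PATH.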
Please create an issue for feature requests and bug reports. You can also use the GitHub Discussions feature for general questions.
If you want to chat with the developers, join the #Data-Parallel-Python_community room on Gitter.im.
Also refer to our CONTRIBUTING page.