A simplified API to Intel® DAAL that lets data scientists and machine learning users get started with the framework quickly. It provides an abstraction over Intel® DAAL for direct use or for integration into your own framework.
Running full scikit-learn test suite with daal4py's optimization patches
With this daal4py API, your Python programs can use Intel® DAAL algorithms in just one line:
kmeans_init(data, 10, t_method="plusPlusDense")
You can even run this on a cluster by simply adding a keyword parameter:
kmeans_init(data, 10, t_method="plusPlusDense", distributed=True)
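The method name `plusPlusDense` suggests k-means++ seeding on dense data. As a rough illustration of what such an initialization computes, here is a minimal pure-Python sketch. It is not DAAL's implementation; the names `sq_dist` and `kmeans_plus_plus_init` and the `seed` parameter are hypothetical and exist only for this example.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_plus_plus_init(points, n_clusters, seed=0):
    """Pick n_clusters initial centers from points using k-means++ seeding.

    Hypothetical helper for illustration only; not part of daal4py.
    """
    rng = random.Random(seed)
    # The first center is drawn uniformly at random.
    centers = [rng.choice(points)]
    # Squared distance from every point to its nearest chosen center.
    d2 = [sq_dist(p, centers[0]) for p in points]
    while len(centers) < n_clusters:
        if sum(d2) == 0:
            # Every point coincides with a center; fall back to a uniform draw.
            nxt = rng.choice(points)
        else:
            # Draw the next center with probability proportional to d2.
            nxt = rng.choices(points, weights=d2, k=1)[0]
        centers.append(nxt)
        # Update each point's distance to its nearest center.
        d2 = [min(old, sq_dist(p, nxt)) for p, old in zip(points, d2)]
    return centers
```

Because already chosen points have weight zero, they are not drawn again as long as some point is still at a positive distance from every center.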
daal4py is easily built from source, with most of the necessary prerequisites available on conda. The instructions below describe how to gather the prerequisites, set up the build environment, and finally build and install the package. daal4py can be built on all three major platforms (Windows, Linux, macOS). Multi-node (distributed) and streaming support can be disabled if desired.
The build process (using setup.py) happens in three stages:
- Creating C++ and Cython sources from the DAAL C++ headers
- Running Cython on the generated sources
- Compiling and linking
The easiest way to build daal4py is to use conda-build with the provided recipe.
- Python version 2.7 or >= 3.6
- conda-build version >= 3
- C++ compiler with C++11 support
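The Python version requirement above ("2.7 or >= 3.6") can be checked before starting a build. The following is a minimal sketch; the function name `python_supported` is hypothetical and not part of daal4py or its build scripts.

```python
import sys


def python_supported(version_info=None):
    """Return True for Python 2.7 or any version >= 3.6."""
    if version_info is None:
        version_info = sys.version_info
    major, minor = version_info[0], version_info[1]
    # Accept exactly 2.7, or anything from 3.6 upwards.
    return (major, minor) == (2, 7) or (major, minor) >= (3, 6)
```

Calling `python_supported()` without arguments checks the running interpreter.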
cd <checkout-dir>
conda build conda-recipe -c intel -c conda-forge
This will build the conda package and tell you where to find it (.../daal4py*.tar.bz2).
conda install <path-to-conda-package-as-built-above>
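If you prefer to locate the built package from a script instead of copying the path from the build output, a search like the following may help. This is a sketch: `find_built_packages` is a hypothetical helper, and the directory you pass in is an assumption that depends on where conda-build reports writing its output.

```python
import glob
import os


def find_built_packages(build_dir):
    """Return sorted paths of daal4py*.tar.bz2 files found below build_dir."""
    # "**" with recursive=True also matches files directly in build_dir.
    pattern = os.path.join(build_dir, "**", "daal4py*.tar.bz2")
    return sorted(glob.glob(pattern, recursive=True))
```

Each returned path can then be passed to `conda install`.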
To actually use daal4py, the packages it depends on need to be installed. To make sure they are, run:
Linux and macOS:
conda install -c intel -c conda-forge mpich tbb daal numpy
Windows:
conda install -c intel mpi_rt tbb daal numpy
Without conda-build, you have to set up your environment manually before building daal4py.
- Python version 2.7 or >= 3.6
- Jinja2
- Cython
- Numpy
- A C++ compiler with C++11 support
- Intel® Threading Building Blocks (Intel® TBB) version 2018.0.4 or later (https://www.threadingbuildingblocks.org/)
  - You can use the pre-built conda package from Intel's channel or the conda-forge channel on anaconda.org (see below)
  - Needed for distributed mode. You can disable support for distributed mode by setting NO_DIST to '1' or 'yes'
- Intel® Data Analytics Acceleration Library (Intel® DAAL) version 2019 or later (https://github.com/01org/daal)
  - You can use the pre-built conda package from Intel's channel on anaconda.org (see below)
- MPI
  - You can use the pre-built conda package from Intel's channel or the conda-forge channel on anaconda.org (see below)
  - Needed for distributed mode. You can disable support for distributed mode by setting NO_DIST to '1' or 'yes'
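Before a manual build, it can be useful to confirm that the Python-level prerequisites from the list above are importable. A minimal sketch for Python 3 follows; `missing_modules` is a hypothetical helper, and it only checks that the modules can be found, not their versions.

```python
import importlib.util


def missing_modules(names=("jinja2", "Cython", "numpy")):
    """Return the import names from `names` that cannot be found."""
    # find_spec returns None when a top-level module is not installed.
    return [n for n in names if importlib.util.find_spec(n) is None]
```

An empty result from `missing_modules()` means Jinja2, Cython and NumPy are all available in the current environment.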
The easiest way to get Cython, DAAL, TBB, MPI etc. is to create a conda environment and set the environment variables:
conda create -n DAAL4PY python=3.6 impi-devel tbb-devel daal daal-include cython jinja2 numpy clang-tools -c intel -c conda-forge
conda activate DAAL4PY
export TBBROOT=$CONDA_PREFIX
export DAALROOT=$CONDA_PREFIX
export MPIROOT=$CONDA_PREFIX
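A quick way to verify that these variables point at real directories is sketched below. `missing_roots` is a hypothetical helper; treating MPIROOT as required only when distributed mode is enabled is an assumption based on MPI being listed as needed for distributed mode.

```python
import os


def missing_roots(environ=None, distributed=True):
    """Return names of required *ROOT variables that are unset or not directories."""
    if environ is None:
        environ = os.environ
    required = ["DAALROOT", "TBBROOT"]
    if distributed:
        # Assumption: MPIROOT only matters when distributed mode is built.
        required.append("MPIROOT")
    return [name for name in required
            if not os.path.isdir(environ.get(name, ""))]
```

An empty list means every required variable is set and refers to an existing directory.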
- DAAL4PY_VERSION: sets package version
- NO_DIST: set to '1', 'yes' or similar to build without support for distributed mode
- NO_STREAM: set to '1', 'yes' or similar to build without support for streaming mode
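To illustrate how on/off variables such as NO_DIST and NO_STREAM might be interpreted, here is a minimal sketch. It is not taken from daal4py's setup.py: `env_flag` is a hypothetical helper, and the set of accepted spellings beyond '1' and 'yes' is an assumption.

```python
import os

# Assumed "truthy" spellings; the text above only names '1' and 'yes'.
TRUTHY = {"1", "yes", "true", "on"}


def env_flag(name, environ=None):
    """Return True if the environment variable `name` holds a truthy value."""
    if environ is None:
        environ = os.environ
    # Unset variables and any other value count as False.
    return environ.get(name, "").strip().lower() in TRUTHY
```

For example, `env_flag("NO_DIST")` reports whether distributed support has been switched off in the current shell under these assumptions.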
If you are building on macOS High Sierra or later, you may run into C++ build errors related to platform targets. In that case, run export MACOSX_DEPLOYMENT_TARGET="10.9" before building.
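When the build is driven from a Python script, the same setting can be applied to the environment handed to the build process. A minimal sketch, where `build_environment` is a hypothetical helper and the default of "10.9" simply mirrors the value given above:

```python
import os
import sys


def build_environment(platform=None, environ=None, target="10.9"):
    """Return a copy of environ, adding MACOSX_DEPLOYMENT_TARGET on macOS if unset."""
    if platform is None:
        platform = sys.platform
    env = dict(os.environ if environ is None else environ)
    if platform == "darwin":
        # Keep a value the user has already chosen.
        env.setdefault("MACOSX_DEPLOYMENT_TARGET", target)
    return env
```

The returned dictionary can be passed as the `env` argument of `subprocess.run` when invoking setup.py.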
Requires Intel® DAAL, Intel® TBB and MPI to be properly set up, e.g. with DAALROOT, TBBROOT and MPIROOT set.
cd <checkout-dir>
python setup.py build_ext
Requires Intel® DAAL, Intel® TBB and MPI to be properly set up, e.g. with DAALROOT, TBBROOT and MPIROOT set.
cd <checkout-dir>
python setup.py install
- sphinx
- sphinx_rtd_theme
- Install daal4py into your Python environment
cd doc && make html
- The documentation will be in doc/_build/html