Intel® oneAPI Data Analytics Library 2021.4
The release introduces the following changes:
📚 Support Materials
The following additional materials were created:
- Medium blogs:
- Anaconda blogs:
- Oracle blogs:
- Kaggle kernels:
- [Tabular Playground Series - Jul 2021] Fast RandomForest with sklearnex
- [Tabular Playground Series - Jul 2021] RF with Intel Extension for Scikit-learn
- [Tabular Playground Series - Jul 2021] Stacking with scikit-learn-intelex
- [Tabular Playground Series - Aug 2021] NuSVR with Intel Extension for Sklearn
- [Predict Future Sales] Stacking with scikit-learn-intelex
- [House Prices - Advanced Regression Techniques] NuSVR sklearn-intelex 4x speedup
- Added demo samples comparing the usage of Intel® Extension for Scikit-learn and the original Scikit-learn for KNN, Logistic Regression, SVM and Random Forest algorithms
🛠️ Library Engineering
- Introduced new functionality for Intel® Extension for Scikit-learn*:
- Enabled patching for all Scikit-learn applications at once:
- You can enable global patching via command line:
python -m sklearnex.glob patch_sklearn
- Or via code:
from sklearnex import patch_sklearn
patch_sklearn(global_patch=True)
- Read more in Intel® Extension for Scikit-learn documentation.
- Added the support of Python 3.9 for both Intel® Extension for Scikit-learn and daal4py. The packages are available from PyPI and the Intel Channel on Anaconda Cloud.
- Introduced new oneDAL functionality:
- Added pkg-config support for Linux, macOS, Windows and for static/dynamic, thread/sequential configurations of oneDAL applications.
- Reduced the size of the oneDAL library by approximately 30%.
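For comparison with the global patching described above, the sketch below patches only the current Python process. It assumes that `sklearnex` and scikit-learn may or may not be installed, so the import is guarded; the helper names `sklearnex_installed` and `get_kmeans_class` are hypothetical and exist only for this illustration.

```python
import importlib.util


def sklearnex_installed() -> bool:
    """Return True if the sklearnex package can be found on this system."""
    return importlib.util.find_spec("sklearnex") is not None


def get_kmeans_class():
    """Return the KMeans class, patched when sklearnex is available."""
    if sklearnex_installed():
        from sklearnex import patch_sklearn

        # Without global_patch=True this affects only the current process.
        patch_sklearn()
    # Import the estimator only after patching has been applied,
    # otherwise the stock scikit-learn class is picked up.
    from sklearn.cluster import KMeans

    return KMeans
```

Calling `get_kmeans_class()` requires scikit-learn; when `sklearnex` is missing, the function simply returns the stock estimator.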
🚨 What's New
Introduced new oneDAL functionality:
- General:
- Basic statistics (Low order moments) algorithm in oneDAL interfaces
- Result options for kNN Brute-force in oneDAL interfaces: using a single function call to return any combination of responses, indices, and distances
- CPU:
- Sigmoid kernel of SVM algorithm
- Model converter from CatBoost to oneDAL representation
- Louvain Community Detection algorithm technical preview
- Connected Components algorithm technical preview
- Search task and cosine distance for kNN Brute-force
- GPU:
- The full range support of Minkowski distances in kNN Brute-force
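To make the kNN Brute-force items above more concrete, here is a minimal NumPy sketch of a brute-force search task with cosine and Minkowski distances, where the caller chooses which results to return. It is not the oneDAL API or its implementation: the function names, the `result_options` tuple, and the dictionary return value are assumptions made for this illustration, and only indices and distances are covered (no responses).

```python
import numpy as np


def pairwise_distances(queries, data, metric="minkowski", p=2.0):
    """Return a dense (n_queries, n_data) distance matrix."""
    queries = np.asarray(queries, dtype=float)
    data = np.asarray(data, dtype=float)
    if metric == "cosine":
        # Cosine distance = 1 - cosine similarity; rows must be non-zero.
        qn = queries / np.linalg.norm(queries, axis=1, keepdims=True)
        dn = data / np.linalg.norm(data, axis=1, keepdims=True)
        return 1.0 - qn @ dn.T
    if metric == "minkowski":
        diff = np.abs(queries[:, None, :] - data[None, :, :])
        if np.isinf(p):
            return diff.max(axis=2)  # Chebyshev limit of the Minkowski family
        return (diff ** p).sum(axis=2) ** (1.0 / p)
    raise ValueError(f"unknown metric: {metric}")


def knn_search(queries, data, k, metric="minkowski", p=2.0,
               result_options=("indices", "distances")):
    """Brute-force search: return only the requested results in one call."""
    dist = pairwise_distances(queries, data, metric, p)
    # A stable sort keeps the lower index first when distances tie.
    order = np.argsort(dist, axis=1, kind="stable")[:, :k]
    result = {}
    if "indices" in result_options:
        result["indices"] = order
    if "distances" in result_options:
        result["distances"] = np.take_along_axis(dist, order, axis=1)
    return result
```

With `data = [[1, 0], [0, 2], [3, 3]]` and the query `[[1, 1]]`, the Euclidean nearest neighbor is row 0, while the cosine nearest neighbor is row 2, because `[3, 3]` points in the same direction as the query.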
Improved oneDAL performance for the following algorithms:
- CPU:
- Decision Forest training and prediction
- Brute-force kNN
- KMeans
- NuSVMs and SVR training
Introduced new functionality in Intel® Extension for Scikit-learn:
- General:
- Enabled the global patching of all Scikit-learn applications
- Provided an integration with dpctl for heterogeneous computing (the support of dpctl.tensor.usm_ndarray for input and output)
- Extended API with set_config and get_config methods. Added the support of target_offload and allow_fallback_to_host options for device offloading scenarios
- Added the support of predict_proba in RandomForestClassifier estimator
- CPU:
- Added the support of Sigmoid kernel in SVM algorithms
- GPU:
- Added binary SVC support with Linear and RBF kernels
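As background for the Sigmoid kernel item above, the following sketch evaluates the kernel in NumPy using its standard textbook form, tanh(gamma * <x, y> + coef0). This is an assumption about the formula, which matches the definition given in the scikit-learn documentation; it does not show how oneDAL or the extension computes the kernel internally, and the function name `sigmoid_kernel_matrix` is hypothetical.

```python
import numpy as np


def sigmoid_kernel_matrix(X, Y, gamma, coef0=0.0):
    """Return K with K[i, j] = tanh(gamma * <X[i], Y[j]> + coef0)."""
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    return np.tanh(gamma * (X @ Y.T) + coef0)
```

In scikit-learn the same kernel is selected with `SVC(kernel="sigmoid")`, where `gamma` and `coef0` are estimator parameters.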
Improved the performance of the following scikit-learn estimators via scikit-learn patching:
- SVR algorithm training
- NuSVC and NuSVR algorithms training
- RandomForestRegressor and RandomForestClassifier algorithms training and prediction
- KMeans
🐛 Bug Fixes
- General:
- Fixed an incorrectly raised exception during the patching of Random Forest algorithm when the number of trees was more than 7000.
- CPU:
- Fixed an accuracy issue in Random Forest algorithm caused by the exclusion of constant features.
- Fixed an issue in NuSVC Multiclass.
- Fixed an issue with KMeans convergence inconsistency.
- Fixed incorrect behavior of train_test_split with specific subset sizes.
- GPU:
- Fixed incorrect bias calculation in SVM.
❗ Known Issues
- GPU:
- For most algorithms, performance degradations were observed when the 2021.4 version of Intel® oneAPI DPC++ Compiler was used.
- Examples are failing when run with Visual Studio Solutions on hardware that does not support double precision floating-point operations.