Releases: oneapi-src/oneDAL
Intel® oneAPI Data Analytics Library 2023.2.1
The release of Intel® oneAPI Data Analytics Library 2023.2.1 introduces the following changes:
🚨 What's New
- sklearn 1.3 support fixes (1, 2, 3)
- Model builders API update
Intel® oneAPI Data Analytics Library 2023.2.0
The release of Intel® oneAPI Data Analytics Library 2023.2.0 introduces the following changes:
❌ Deprecation Notice
- The compression functionality in the Intel® oneDAL library is deprecated. Starting with the 2024.0 release, oneDAL will not support the compression functionality
- The DAAL CPP SYCL Interfaces in the Intel® oneDAL library are deprecated. Starting with the 2024.0 release, oneDAL will not support the DAAL CPP SYCL Interfaces
- The Java* interfaces in the Intel® oneDAL library are marked as deprecated. The future releases of the oneDAL library may no longer include support for these Java* interfaces
- ABI compatibility is to be broken as part of the 2024.0 release of Intel® oneDAL. The library’s major version is to be incremented to two to enforce the relinking of existing applications
- macOS* support is deprecated for oneDAL. The 2023.x releases are the last to provide it
🛠️ Library Engineering
- CSR tables interface has been changed and moved from detail namespace
🚨 What's New
- Introduced new Intel® oneDAL functionality:
- Distributed KMeans++ algorithm
- Logistic Loss objective algorithm
- Introduced new functionality for Intel® Extension for Scikit-learn:
- NaN(missing values) support was added to Model Builders
- Improved performance for the following Intel® Extension for Scikit-learn algorithms:
- Model Builders performance has been improved up to 2x
Intel® oneAPI Data Analytics Library 2023.1.1
The release of Intel® oneAPI Data Analytics Library 2023.1.1 introduces the following changes:
🚨 What's New
Intel® oneAPI Data Analytics Library 2023.1.0
The release of Intel® oneAPI Data Analytics Library 2023.1 introduces the following changes:
📚 Support Materials
🛠️ Library Engineering
- Reduced the size of the Intel® oneDAL library by approximately 30%
- Enabled the NuGet distribution channel for Intel® oneDAL on Linux and macOS
🚨 What's New
- Introduced new Intel® oneDAL functionality:
- Distributed Linear Regression, kNN, PCA algorithms
- Introduced new functionality for Intel® Extension for Scikit-learn:
- Enabled PCA, Linear Regression, Random Forest algorithms and SPMD policy as preview
- Scikit-learn 1.2 support
- sklearn_is_patched() function added to validate status of algorithms patching
- Improved performance for the following Intel® Extension for Scikit-learn algorithms:
- t-SNE for “Barnes-Hut” algorithm
- SVM algorithm for single row inference
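The `sklearn_is_patched()` check listed above can be exercised roughly as follows. This is a minimal sketch that assumes the scikit-learn-intelex package is installed and importable as `sklearnex`; the helper name `patching_status` is hypothetical and used only for illustration. If the package is missing, the helper returns `None` instead of failing.

```python
def patching_status():
    """Patch scikit-learn, report whether patching is active, then undo it.

    Returns None when scikit-learn-intelex is not installed (assumption:
    the package is importable as `sklearnex`).
    """
    try:
        from sklearnex import patch_sklearn, sklearn_is_patched, unpatch_sklearn
    except ImportError:
        return None

    patch_sklearn()  # replace supported scikit-learn estimators
    try:
        return bool(sklearn_is_patched())  # validate the patching status
    finally:
        unpatch_sklearn()  # restore the stock scikit-learn estimators


status = patching_status()
if status is None:
    print("scikit-learn-intelex is not installed")
else:
    print("scikit-learn patched:", status)
```

Calling the check right after `patch_sklearn()` is a cheap way to confirm that the accelerated estimators are actually in use before running a long benchmark.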
❗ Known Issues
- Under certain conditions, the DAAL SYCL interface might hang with the Level Zero (L0) backend. Use the oneDAL DPC++ interfaces instead. If the older interfaces are required, the OpenCL backend can be used as a workaround.
Intel® oneAPI Data Analytics Library 2023.0.1
Intel® oneAPI Data Analytics Library 2023.0.0
The release of Intel® oneAPI Data Analytics Library 2023.0 introduces the following changes:
🚨 What's New
- Introduced new Intel® oneDAL functionality:
- DPC++ interface for Linear Regression algorithm
❗ Known Issues
- Intel® Extension for Scikit-learn SVC.fit and KNN.fit do not support GPU
- Most Intel® Extension for Scikit-learn sycl examples fail when using GPU context
- Running the Random Forest algorithm with versions 2021.7.1 and 2023.0 of scikit-learn-intelex on 2nd Generation Intel® Xeon® Scalable Processors (formerly Cascade Lake) may result in an 'Illegal instruction' error.
- No workaround is currently available for this issue.
- Recommendation: Use an older version of scikit-learn-intelex until the issue is fixed in a future release.
Intel® oneAPI Data Analytics Library 2021.7.1
The release of Intel® oneAPI Data Analytics Library 2021.7.1 introduces the following changes:
📚 Support Materials
- [Tabular Playground Series - Sep 2022] Tuning of ElasticNet hyperparameters
- Accelerated Random Forest for Rent Prediction
🚨 What's New
- zlib and bzip2 methods of compression were deprecated. They are dispatched to the lzo method starting with this version
- Optional results (eigenvectors, eigenvalues, variances and means) and precomputed method for PCA algorithm
Intel® oneAPI Data Analytics Library 2021.6
The release of Intel® oneAPI Data Analytics Library 2021.6 introduces the following changes:
📚 Support Materials:
Kaggle kernels for Intel® Extension for Scikit-learn:
- Fast Feature Importance using scikit-learn-intelex
- [Tabular Playground Series - December 2021] Fast Feature Importance with sklearnex
- [Tabular Playground Series - December 2021] SVC with sklearnex 20x speedup
- [Tabular Playground Series - January 2022] Fast PyCaret with Scikit-learn-Intelex
- [Tabular Playground Series - February 2022] KNN with sklearnex 13x speedup
- Fast SVM for Sparse Data from NLP Problem
- Introduction to scikit-learn-intelex
- [Datasets] Fast Feature Importance using sklearnex
- [Tabular Playground Series - March 2022] Fast workflow using scikit-learn-intelex
🛠️ Library Engineering
- Reduced the size of oneDAL python run-time package by approximately 8%
- Added Python 3.10 support for daal4py and Intel® Extension for Scikit-learn packages
🚨 What's New
- Improved performance of oneDAL algorithms:
- Optimized data conversion for tables with column-major layout in host memory to tables with row-major layout in device memory
- Optimized the computation of Minkowski distances in brute-force kNN on CPU
- Optimized Covariance algorithm
- Added DPC++ column-wise atomic reduction
- Introduced new oneDAL functionality:
- KMeans distributed random dense initialization
- Distributed PcaCov
- sendrecv_replace communicator method
- Added new parameters to oneDAL algorithms:
- Weights in Decision Forest for CPU
- Cosine and Chebyshev distances for KNN on GPU
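For readers unfamiliar with the distance metrics named in this release (Minkowski, Cosine, Chebyshev), the following pure-Python sketch shows what each one computes and how a brute-force kNN search uses it. It illustrates the definitions only and is not the oneDAL implementation or API; all function names are hypothetical.

```python
import math


def minkowski(a, b, p=2.0):
    """Minkowski distance of order p (p=1: Manhattan, p=2: Euclidean)."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def chebyshev(a, b):
    """Chebyshev distance: the largest coordinate-wise difference."""
    return max(abs(x - y) for x, y in zip(a, b))


def cosine_distance(a, b):
    """Cosine distance: 1 - cosine similarity (undefined for zero vectors)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def brute_force_knn(train, query, k, distance=minkowski):
    """Return the indices of the k training points closest to the query."""
    order = sorted(range(len(train)), key=lambda i: distance(train[i], query))
    return order[:k]


train = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
query = [0.9, 0.9]
neighbors = brute_force_knn(train, query, k=2)
neighbors_chebyshev = brute_force_knn(train, query, k=2, distance=chebyshev)
print(neighbors, neighbors_chebyshev)
```

Brute-force search evaluates the chosen distance between the query and every training point, which is why optimizing the distance computation itself matters for this algorithm.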
Intel® oneAPI Data Analytics Library 2021.5
The release introduces the following changes:
📚 Support Materials
The following additional materials were created:
- oneDAL samples:
- Intel® Extension for Scikit-learn samples:
- Demo samples of the Intel® Extension for Scikit-learn usage with the performance comparison to original Scikit-learn for ElasticNet, K-means, Lasso Regression, Linear regression, and Ridge Regression
- Demo samples of the Modin usage
- daal4py samples:
- An example of Catboost converter usage
- Kaggle kernels for Intel® Extension for Scikit-learn:
- [Tabular Playground Series - Sep 2021] Ridge with sklearn-intelex 2x speedup
- [Tabular Playground Series - Oct 2021] Fast AutoML with Intel Extension for Scikit-learn
- [Titanic – Machine Learning from Disaster] AutoML with Intel Extension for Sklearn
- [Tabular Playground Series - Nov 2021] AutoML with Intel® Extension
- [Tabular Playground Series - Nov 2021] Log Regression with sklearnex 17x speedup
- [Tabular Playground Series - Dec 2021] SVC with sklearnex 20x speedup
- [Tabular Playground Series - Dec 2021] Fast Feature Importance with sklearnex
🛠️ Library Engineering
- Reduced the size of the oneDAL library by approximately 15%.
🚨 What's New
- Introduced new oneDAL functionality:
- Distributed algorithms for Covariance, DBSCAN, Decision Forest, Low Order Moments
- oneAPI interfaces for Linear Regression, DBSCAN, KNN
- Improved error handling for distributed algorithms in oneDAL in case of compute node failures
- Improved performance for the following oneDAL algorithms:
- Louvain algorithm
- KNN and SVM algorithms on GPU
- Introduced new functionality for Intel® Extension for Scikit-learn:
- Scikit-learn 1.0 support
- Fixed the following issues:
- Stabilized the results of Linear Regression in oneDAL and Intel® Extension for Scikit-learn
- Fixed an issue with RPATH on macOS
Intel® oneAPI Data Analytics Library 2021.4
The release introduces the following changes:
📚 Support Materials
The following additional materials were created:
- Medium blogs:
- Anaconda blogs:
- Oracle blogs:
- Kaggle kernels:
- [Tabular Playground Series - Jul 2021] Fast RandomForest with sklearnex
- [Tabular Playground Series - Jul 2021] RF with Intel Extension for Scikit-learn
- [Tabular Playground Series - Jul 2021] Stacking with scikit-learn-intelex
- [Tabular Playground Series - Aug 2021] NuSVR with Intel Extension for Sklearn
- [Predict Future Sales] Stacking with scikit-learn-intelex
- [House Prices - Advanced Regression Techniques] NuSVR sklearn-intelex 4x speedup
- Added demo samples comparing the usage of Intel® Extension for Scikit-learn and the original Scikit-learn for KNN, Logistic Regression, SVM and Random Forest algorithms
🛠️ Library Engineering
- Introduced new functionality for Intel® Extension for Scikit-learn*:
- Enabled patching for all Scikit-learn applications at once:
- You can enable global patching via command line:
python -m sklearnex.glob patch_sklearn
- Or via code:
from sklearnex import patch_sklearn
patch_sklearn(global_patch=True)
- Read more in Intel® Extension for Scikit-learn documentation.
- Added the support of Python 3.9 for both Intel® Extension for Scikit-learn and daal4py. The packages are available from PyPI and the Intel Channel on Anaconda Cloud.
- Introduced new oneDAL functionality:
- Added pkg-config support for Linux, macOS, Windows and for static/dynamic, thread/sequential configurations of oneDAL applications.
- Reduced the size of the oneDAL library by approximately 30%.
🚨 What's New
Introduced new oneDAL functionality:
- General:
- Basic statistics (Low order moments) algorithm in oneDAL interfaces
- Result options for kNN Brute-force in oneDAL interfaces: using a single function call to return any combination of responses, indices, and distances
- CPU:
- Sigmoid kernel of SVM algorithm
- Model converter from CatBoost to oneDAL representation
- Louvain Community Detection algorithm technical preview
- Connected Components algorithm technical preview
- Search task and cosine distance for kNN Brute-force
- GPU:
- The full range support of Minkowski distances in kNN Brute-force
Improved oneDAL performance for the following algorithms:
- CPU:
- Decision Forest training and prediction
- Brute-force kNN
- KMeans
- NuSVMs and SVR training
Introduced new functionality in Intel® Extension for Scikit-learn:
- General:
- Enabled the global patching of all Scikit-learn applications
- Provided an integration with dpctl for heterogeneous computing (the support of dpctl.tensor.usm_ndarray for input and output)
- Extended API with set_config and get_config methods. Added the support of target_offload and allow_fallback_to_host options for device offloading scenarios
- Added the support of predict_proba in RandomForestClassifier estimator
- CPU:
- Added the support of Sigmoid kernel in SVM algorithms
- GPU:
- Added binary SVC support with Linear and RBF kernels
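The `set_config` and `get_config` methods mentioned above can be tried out as in the following minimal sketch. It assumes scikit-learn-intelex is installed and importable as `sklearnex`; the helper name `offload_config` is hypothetical. Only `allow_fallback_to_host` is changed here, because setting `target_offload` to a device requires matching hardware and runtime support.

```python
def offload_config():
    """Enable host fallback, return the resulting configuration, then restore it.

    Returns None when scikit-learn-intelex is not installed (assumption:
    the package is importable as `sklearnex`).
    """
    try:
        from sklearnex import get_config, set_config
    except ImportError:
        return None

    previous = get_config()["allow_fallback_to_host"]
    # Allow computation to fall back to the host when device offload
    # is not possible for a given call.
    set_config(allow_fallback_to_host=True)
    try:
        return dict(get_config())
    finally:
        set_config(allow_fallback_to_host=previous)  # restore the old setting


config = offload_config()
if config is None:
    print("scikit-learn-intelex is not installed")
else:
    print("target_offload:", config["target_offload"])
    print("allow_fallback_to_host:", config["allow_fallback_to_host"])
```

Restoring the previous value keeps the sketch from changing the configuration of other code running in the same process.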
Improved the performance of the following scikit-learn estimators via scikit-learn patching:
- SVR algorithm training
- NuSVC and NuSVR algorithms training
- RandomForestRegressor and RandomForestClassifier algorithms training and prediction
- KMeans
🐛 Bug Fixes
- General:
- Fixed an incorrectly raised exception during the patching of Random Forest algorithm when the number of trees was more than 7000.
- CPU:
- Fixed an accuracy issue in Random Forest algorithm caused by the exclusion of constant features.
- Fixed an issue in NuSVC Multiclass.
- Fixed an issue with KMeans convergence inconsistency.
- Fixed incorrect work of train_test_split with specific subset sizes.
- GPU:
- Fixed incorrect bias calculation in SVM.
❗ Known Issues
- GPU:
- For most algorithms, performance degradations were observed when the 2021.4 version of Intel® oneAPI DPC++ Compiler was used.
- Examples are failing when run with Visual Studio Solutions on hardware that does not support double precision floating-point operations.