Implement Logistic Regression algorithm #2530

Merged
71 commits merged into oneapi-src:master on Nov 8, 2023

Conversation

@avolkov-intel (Contributor) commented Sep 27, 2023

Add LogisticRegression algorithm to oneDAL

Changes proposed in this pull request:

  • Implement LogisticRegression algorithm interface
  • Implement dense_batch method for solving binary classification problem
  • Add new convergence tests for newton_cg primitive
  • Add convergence tests for LogisticRegression algorithm
  • Add dpc example for LogisticRegression algorithm

- Add control over the number of iterations in while loops
- Use the l2-norm for convergence checks in the cg-solver (see the sketch below)
- Move QuadraticFunction to the primitives section
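
As a rough illustration of the convergence criterion mentioned above, here is a minimal sketch of an l2-norm based stopping check for an iterative solver. The function and parameter names (has_converged, residual, tol) are placeholders chosen for this sketch and are not part of the oneDAL newton_cg interface.

#include <cmath>
#include <cstddef>
#include <vector>

// Stop once the Euclidean (l2) norm of the residual drops below the tolerance.
bool has_converged(const std::vector<double>& residual, double tol) {
    double sq_sum = 0.0;
    for (std::size_t i = 0; i < residual.size(); ++i) {
        sq_sum += residual[i] * residual[i];
    }
    return std::sqrt(sq_sum) < tol;
}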
@avolkov-intel added the enhancement, dpc++, new algorithm, and testing labels on Sep 27, 2023
@avolkov-intel (Contributor Author)
/intelci: run

@avolkov-intel (Contributor Author)
/intelci: run

}

template <typename Float, typename Task>
static infer_result<Task> call_dal_kernel(const context_gpu& ctx,
Contributor: I suggest avoiding the call_dal_kernel name.

Contributor Author: We have it in many algorithms (in linear_regression, for example), so I do not see what the problem with the naming is.

Contributor: Because this name is not representative of what the kernel does.

namespace pr = be::primitives;

template <typename Float, typename Task>
static train_result<Task> call_dal_kernel(const context_gpu& ctx,
Contributor: The same comment applies here.

// TODO: add check if the dataset can be moved to gpu
// Move data to gpu
pr::ndarray<Float, 2> data_nd = pr::table2ndarray<Float>(queue, data, sycl::usm::alloc::device);
table data_gpu = homogen_table::wrap(data_nd.flatten(queue, {}), sample_count, feature_count);
Contributor: What is the reason for converting the GPU ndarray to a table? Is it possible to provide data_nd directly?

Resolved (outdated) review comments: cpp/oneapi/dal/algo/logistic_regression/test/fixture.hpp, cpp/oneapi/dal/algo/logistic_regression/train_types.hpp
#include "oneapi/dal/exceptions.hpp"
#include "example_util/utils.hpp"
#include <chrono>
#include <iostream>
Contributor: Is it already included in utils.hpp?

@avolkov-intel (Contributor Author)
/intelci: run

Comment on lines +46 to +50
template <typename Float>
std::int64_t propose_block_size(const sycl::queue& q, std::int64_t f, std::int64_t r) {
constexpr std::int64_t fsize = sizeof(Float);
return 0x10000l * (8 / fsize);
}
Contributor: Please change this in the future.

Contributor: It should be changed in this PR.
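
For context on the snippet above, a short reading of what the hard-coded expression evaluates to (this only restates the quoted code, it is not a proposed fix): the returned element count scales inversely with sizeof(Float), so the block covers the same 512 KiB for both float and double, and the q, f and r parameters are currently unused.

#include <cstdint>

// Mirror of the quoted heuristic, to show the constant byte footprint.
template <typename Float>
constexpr std::int64_t proposed_block_elems() {
    constexpr std::int64_t fsize = sizeof(Float);
    return 0x10000l * (8 / fsize); // 0x20000 elements for float, 0x10000 for double
}

static_assert(proposed_block_elems<float>() * sizeof(float) == 512 * 1024, "constant byte footprint");
static_assert(proposed_block_elems<double>() * sizeof(double) == 512 * 1024, "constant byte footprint");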

Comment on lines +72 to +73
pr::ndarray<Float, 1> params = pr::table2ndarray_1d<Float>(queue, betas, alloc);
pr::ndview<Float, 1> params_suf = fit_intercept ? params : params.slice(1, feature_count);
Contributor: I would recommend just using auto here; the type is already defined explicitly on the right-hand side.

Contributor Author: I'm not sure it is explicitly defined on the second line, since params_suf = params may result in the ndarray type. Maybe that is fine, but to avoid confusion I would leave it as is.
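
For readers outside this thread, here is a minimal sketch of what the slicing in the quoted lines achieves, with std::span standing in for pr::ndview; the names and layout below are assumptions made for illustration, not the oneDAL API: index 0 holds the intercept slot and the remaining feature_count entries hold the coefficients, so when the intercept is not fitted the leading slot is skipped.

#include <cstddef>
#include <span>
#include <vector>

// Hypothetical coefficient buffer: index 0 is the intercept slot,
// indices 1..feature_count are the feature coefficients.
std::span<const double> effective_params(const std::vector<double>& params,
                                         bool fit_intercept,
                                         std::size_t feature_count) {
    std::span<const double> all(params);
    // Skip the intercept slot when the model is trained without an intercept term.
    return fit_intercept ? all : all.subspan(1, feature_count);
}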

Resolved (outdated) review comments: cpp/oneapi/dal/algo/logistic_regression/common.hpp, cpp/oneapi/dal/algo/logistic_regression/test/fixture.hpp (3 threads)
Comment on lines +145 to +146
table X_train = homogen_table::wrap<float_t>(X_host_.get_mutable_data(), train_size, p_);
table X_test = homogen_table::wrap<float_t>(X_host_.get_mutable_data() + train_size * p_,
Contributor: Please just use two separate tables/ndarrays all the way through instead.

Contributor Author: I think that would lead to code duplication during generation and prediction; I see no problem in having the data stored in one array.
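
To make the layout being discussed concrete, here is a minimal sketch of the single-buffer approach, with raw pointers standing in for the wrapped tables; the variable names are hypothetical and the sizes are arbitrary: the training view starts at the beginning of the row-major buffer and the test view starts train_size * p elements later, so nothing is copied.

#include <cstddef>
#include <vector>

int main() {
    const std::size_t train_size = 100, test_size = 50, p = 4;
    // One row-major buffer holding the training rows first, then the test rows.
    std::vector<float> x_host((train_size + test_size) * p, 0.0f);

    const float* train_ptr = x_host.data();                 // rows [0, train_size)
    const float* test_ptr = x_host.data() + train_size * p; // rows [train_size, train_size + test_size)

    // Both views reference the same allocation: generation fills x_host once,
    // and training/prediction read through the two pointers.
    (void)train_ptr;
    (void)test_ptr;
    return 0;
}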

Resolved (outdated) review comment: cpp/oneapi/dal/algo/logistic_regression/test/fixture.hpp
Comment on lines +194 to +195
float_t min_train_acc = 0.95;
float_t min_test_acc = n_ < 500 ? 0.7 : 0.85;
Contributor: What does this mean?

Contributor Author: For small datasets it is harder to fit the model, so the accuracy threshold should be lower.

@avolkov-intel (Contributor Author)
/intelci: run

// TODO: add check if the dataset can be moved to gpu
// Move data to gpu
pr::ndarray<Float, 2> data_nd = pr::table2ndarray<Float>(queue, data, sycl::usm::alloc::device);
table data_gpu = homogen_table::wrap(data_nd.flatten(queue, {}), sample_count, feature_count);
Contributor: Define alldeps at once.

@avolkov-intel (Contributor Author)
/intelci: run

@avolkov-intel (Contributor Author)
/intelci: rerun

@ethanglaser (Contributor) left a comment
Please address the Bazel CPU test failure before merge.

@ethanglaser (Contributor)
Would it make more sense to put this in algo or in primitives?

Contributor: Generally this is probably the only BUILD file needed within logistic_regression if there are no tests outside of logistic_regression/tests.

@ethanglaser (Contributor) left a comment
Looks good, left a few comments. CI is in good shape.

@avolkov-intel merged commit 286f2f7 into oneapi-src:master on Nov 8, 2023
13 checks passed
Labels: dpc++, enhancement, new algorithm, testing
9 participants