Implement Logistic Regression algorithm #2530
Conversation
- Add control over the number of iterations in while loops
- Use the l2-norm for convergence checks in the cg-solver
- Move QuadraticFunction to the primitives section
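The l2-norm convergence check with an iteration cap, as described in the bullets above, can be sketched in plain standard C++. The helper names below are illustrative, not oneDAL's actual cg-solver API:

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Euclidean (l2) norm of a residual vector.
double l2_norm(const std::vector<double>& r) {
    double sum = 0.0;
    for (double v : r)
        sum += v * v;
    return std::sqrt(sum);
}

// Convergence predicate: stop when ||r||_2 falls below tol.
// In the solver loop this is combined with an explicit cap on the
// number of iterations, so the while loop always terminates.
bool converged(const std::vector<double>& residual, double tol) {
    return l2_norm(residual) < tol;
}
```

The iteration cap matters independently of the norm check: with ill-conditioned systems the residual may stall above the tolerance, and the cap guarantees termination.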
/intelci: run
cpp/oneapi/dal/algo/logistic_regression/backend/cpu/infer_kernel_dense_batch.cpp (resolved)
cpp/oneapi/dal/algo/logistic_regression/backend/gpu/infer_kernel_dense_batch_dpc.cpp (resolved)
}

template <typename Float, typename Task>
static infer_result<Task> call_dal_kernel(const context_gpu& ctx,
I suggest avoiding the name call_dal_kernel.
We use it in many algorithms (in linear_regression, for example), so I do not see what the problem with the naming is.
Because the name does not describe what this kernel actually does.
namespace pr = be::primitives;

template <typename Float, typename Task>
static train_result<Task> call_dal_kernel(const context_gpu& ctx,
The same comment applies here.
// TODO: add check if the dataset can be moved to gpu
// Move data to gpu
pr::ndarray<Float, 2> data_nd = pr::table2ndarray<Float>(queue, data, sycl::usm::alloc::device);
table data_gpu = homogen_table::wrap(data_nd.flatten(queue, {}), sample_count, feature_count);
What is the reason for converting the GPU ndarray back to a table? Is it possible to pass data_nd directly?
#include "oneapi/dal/exceptions.hpp"
#include "example_util/utils.hpp"
#include <chrono>
#include <iostream>
Is it already included in utils.hpp?
/intelci: run
cpp/oneapi/dal/algo/logistic_regression/backend/cpu/infer_kernel_dense_batch.cpp (resolved)
template <typename Float>
std::int64_t propose_block_size(const sycl::queue& q, std::int64_t f, std::int64_t r) {
    constexpr std::int64_t fsize = sizeof(Float);
    return 0x10000l * (8 / fsize);
}
Please change this in the future
It should be changed in this PR.
cpp/oneapi/dal/algo/logistic_regression/backend/gpu/infer_kernel_dense_batch_dpc.cpp (resolved)
pr::ndarray<Float, 1> params = pr::table2ndarray_1d<Float>(queue, betas, alloc);
pr::ndview<Float, 1> params_suf = fit_intercept ? params : params.slice(1, feature_count);
I would recommend just using auto here; the type is already written out explicitly on the right-hand side.
I'm not sure it is explicit in the second line: with auto, params_suf = params could end up with type ndarray rather than ndview. Maybe it is fine, but to avoid confusion I would leave it as is.
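The deduction concern can be illustrated with standard types instead of ndarray/ndview (this is a plain-C++ analogy, not oneDAL code): std::string converts implicitly to std::string_view but not the other way around, so a ternary mixing the two has the view type, and auto would deduce that, not the owning type.

```cpp
#include <string>
#include <string_view>
#include <type_traits>

// The conditional operator's result type follows the usual conversion
// rules: std::string -> std::string_view is implicit, the reverse is
// explicit-only, so the whole ternary is a std::string_view.
std::string_view tail_or_whole(const std::string& s, bool whole) {
    auto result = whole ? s : std::string_view(s).substr(1);
    static_assert(std::is_same_v<decltype(result), std::string_view>,
                  "auto deduces the view type, not std::string");
    return result;
}
```

Which type the ternary yields for ndarray vs. ndview depends on their conversion rules, so writing the ndview type explicitly, as the author prefers, documents the intent either way.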
cpp/oneapi/dal/algo/logistic_regression/backend/gpu/infer_kernel_dense_batch_dpc.cpp (resolved)
table X_train = homogen_table::wrap<float_t>(X_host_.get_mutable_data(), train_size, p_);
table X_test = homogen_table::wrap<float_t>(X_host_.get_mutable_data() + train_size * p_,
Please just use two separate tables/ndarrays throughout instead.
I think that would lead to code duplication during generation and prediction; I see no problem with keeping the data in one array.
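The single-buffer layout being defended can be sketched in standard C++: one contiguous row-major buffer of (train + test) rows, with the train and test "tables" wrapping disjoint regions via a pointer offset, mirroring homogen_table::wrap over data + train_size * p. The view struct and helper names below are illustrative, not oneDAL types:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// A non-owning (pointer, rows, cols) view over a row-major buffer,
// standing in for a wrapped homogen_table.
struct view {
    const float* data;
    std::size_t rows, cols;
};

// Train view: first train_size rows of the shared buffer.
view split_train(const std::vector<float>& buf, std::size_t train_size, std::size_t p) {
    return { buf.data(), train_size, p };
}

// Test view: the remaining rows, starting at element train_size * p.
view split_test(const std::vector<float>& buf, std::size_t train_size, std::size_t p) {
    const std::size_t total_rows = buf.size() / p;
    return { buf.data() + train_size * p, total_rows - train_size, p };
}
```

Both views alias one allocation, so the data-generation code runs once over the whole buffer; the trade-off the reviewer raises is that the offset arithmetic is implicit rather than enforced by separate containers.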
float_t min_train_acc = 0.95;
float_t min_test_acc = n_ < 500 ? 0.7 : 0.85;
What do these thresholds mean?
For small datasets it is harder to fit the model, so the accuracy threshold should be lower.
/intelci: run
// TODO: add check if the dataset can be moved to gpu
// Move data to gpu
pr::ndarray<Float, 2> data_nd = pr::table2ndarray<Float>(queue, data, sycl::usm::alloc::device);
table data_gpu = homogen_table::wrap(data_nd.flatten(queue, {}), sample_count, feature_count);
define alldeps at once
/intelci: run
/intelci: rerun
Please address the Bazel CPU test failure before merging.
cpp/oneapi/dal/algo/logistic_regression/backend/gpu/train_kernel_dense_batch_dpc.cpp (resolved)
cpp/oneapi/dal/algo/logistic_regression/parameters/cpu/train_parameters.cpp (resolved)
cpp/oneapi/dal/algo/logistic_regression/parameters/gpu/train_parameters_dpc.cpp (resolved)
Would it make more sense to put this in algo or in primitives?
Generally, this is probably the only BUILD file needed within logistic_regression, assuming there are no tests outside of logistic_regression/tests.
Looks good, left a few comments. CI in good shape.
Add LogisticRegression algorithm to oneDAL
Changes proposed in this pull request: