mlr3tuningspaces is a collection of search spaces for hyperparameter optimization in the mlr3 ecosystem. It features ready-to-use search spaces for many popular machine learning algorithms. The search spaces are taken from scientific articles and work for a wide range of data sets. Currently, we offer tuning spaces from three publications.
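Publication | Learner | n Hyperparameters |
---|---|---|
Bischl et al. (2023) | glmnet | 2 |
Bischl et al. (2023) | kknn | 3 |
Bischl et al. (2023) | ranger | 4 |
Bischl et al. (2023) | rpart | 3 |
Bischl et al. (2023) | svm | 4 |
Bischl et al. (2023) | xgboost | 8 |
Kuehn et al. (2018) | glmnet | 2 |
Kuehn et al. (2018) | kknn | 1 |
Kuehn et al. (2018) | ranger | 8 |
Kuehn et al. (2018) | rpart | 4 |
Kuehn et al. (2018) | svm | 5 |
Kuehn et al. (2018) | xgboost | 13 |
Binder, Pfisterer, and Bischl (2020) | glmnet | 2 |
Binder, Pfisterer, and Bischl (2020) | kknn | 1 |
Binder, Pfisterer, and Bischl (2020) | ranger | 6 |
Binder, Pfisterer, and Bischl (2020) | rpart | 4 |
Binder, Pfisterer, and Bischl (2020) | svm | 4 |
Binder, Pfisterer, and Bischl (2020) | xgboost | 10 |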
There are several sections about hyperparameter optimization in the mlr3book.
- Get started with the book section on mlr3tuningspaces.
- Learn about search spaces.
Install the latest release from CRAN:
install.packages("mlr3tuningspaces")
Install the development version from GitHub:
remotes::install_github("mlr-org/mlr3tuningspaces")
A learner passed to the lts() function is augmented with the default tuning space from Bischl et al. (2023).
library(mlr3tuningspaces)
learner = lts(lrn("classif.rpart"))
# tune learner on pima data set
instance = tune(
  tnr("random_search"),
  task = tsk("pima"),
  learner = learner,
  resampling = rsmp("holdout"),
  measure = msr("classif.ce"),
  term_evals = 10
)
# best performing hyperparameter configuration
instance$result
## cp minbucket minsplit learner_param_vals x_domain classif.ce
## 1: -2.50293 3.110378 1.83171 <list[4]> <list[3]> 0.2148438
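The values printed in instance$result are reported on the scale of the search space, which is why cp can be negative when a hyperparameter is tuned on a log scale. The following is a minimal sketch of how the best configuration could be transferred back to the learner, assuming the result_learner_param_vals field that mlr3tuning instances provide. The final fit on the full task is only an illustration and not part of the original example.

```r
library(mlr3)
library(mlr3tuning)
library(mlr3tuningspaces)

learner = lts(lrn("classif.rpart"))

# same tuning run as in the example above
instance = tune(
  tnr("random_search"),
  task = tsk("pima"),
  learner = learner,
  resampling = rsmp("holdout"),
  measure = msr("classif.ce"),
  term_evals = 10
)

# best configuration on the learner's original scale
# (transformations of the search space are already applied)
instance$result_learner_param_vals

# replace the tune tokens with the best configuration and fit a final model
learner$param_set$values = instance$result_learner_param_vals
learner$train(tsk("pima"))
```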
The mlr_tuning_spaces dictionary contains all tuning spaces.
library("data.table")
# print keys and tuning spaces
as.data.table(mlr_tuning_spaces)
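The dictionary can also be queried selectively. The sketch below uses the keys() method that mlr3 dictionaries offer. It assumes that the table returned by as.data.table() contains a learner column, which may differ between package versions.

```r
library(mlr3tuningspaces)
library(data.table)

# all keys that match the pattern "rpart"
mlr_tuning_spaces$keys(pattern = "rpart")

# all tuning spaces defined for one learner
# (assumption: the table has a `learner` column)
tab = as.data.table(mlr_tuning_spaces)
tab[learner == "classif.rpart"]
```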
A key passed to the lts() function returns the TuningSpace.
tuning_space = lts("classif.rpart.rbv2")
tuning_space
## <TuningSpace:classif.rpart.rbv2>: Classification Rpart with RandomBot
## id lower upper levels logscale
## 1: cp 1e-04 1 [NULL] TRUE
## 2: maxdepth 1e+00 30 [NULL] FALSE
## 3: minbucket 1e+00 100 [NULL] FALSE
## 4: minsplit 1e+00 100 [NULL] FALSE
Get the learner with the tuning space applied.
tuning_space$get_learner()
## <LearnerClassifRpart:classif.rpart>: Classification Tree
## * Model: -
## * Parameters: cp=<RangeTuneToken>, maxdepth=<RangeTuneToken>, minbucket=<RangeTuneToken>,
## minsplit=<RangeTuneToken>, xval=0
## * Packages: mlr3, rpart
## * Predict Types: [response], prob
## * Feature Types: logical, integer, numeric, factor, ordered
## * Properties: importance, missings, multiclass, selected_features, twoclass, weights
Tuning spaces can be applied to the learners in a pipeline.
library(mlr3pipelines)
# set default tuning space
graph_learner = as_learner(po("subsample") %>>%
  lts(lrn("classif.rpart")))
# set rbv2 tuning space
tuning_space = lts("classif.rpart.rbv2")
graph_learner$graph$pipeops$classif.rpart$param_set$set_values(.values = tuning_space$values)
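Once the tune tokens are set, the graph learner can be tuned like any other learner. The following is a rough sketch that reuses the tuner, task and budget from the first example. These choices are illustrative and not a recommendation for pipelines.

```r
library(mlr3)
library(mlr3tuning)
library(mlr3tuningspaces)
library(mlr3pipelines)

# graph learner with the default tuning space on the rpart learner
graph_learner = as_learner(po("subsample") %>>%
  lts(lrn("classif.rpart")))

# the hyperparameter ids are prefixed with the id of the pipe operator
graph_learner$param_set$search_space()

# tune the whole pipeline
instance = tune(
  tnr("random_search"),
  task = tsk("pima"),
  learner = graph_learner,
  resampling = rsmp("holdout"),
  measure = msr("classif.ce"),
  term_evals = 10
)

instance$result
```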
We are looking forward to new collections of tuning spaces from peer-reviewed articles. You can suggest new tuning spaces in an issue or contribute a new collection yourself in a pull request. Take a look at an already implemented collection, e.g. our default tuning spaces from Bischl et al. (2023). A TuningSpace is added to the mlr_tuning_spaces dictionary with the add_tuning_space() function.
Create a tuning space for each variant of the learner, e.g. for LearnerClassifRpart and LearnerRegrRpart.
vals = list(
  minsplit = to_tune(2, 64, logscale = TRUE),
  cp = to_tune(1e-04, 1e-1, logscale = TRUE)
)
add_tuning_space(
  id = "classif.rpart.example",
  values = vals,
  tags = c("default", "classification"),
  learner = "classif.rpart",
  label = "Classification Tree Example"
)
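After registration, the new tuning space should be retrievable like the built-in ones. The sketch below repeats the registration so that it runs on its own. The key classif.rpart.example is the illustrative id from the example above, not a tuning space shipped with the package.

```r
library(mlr3)
library(mlr3tuning)
library(mlr3tuningspaces)

# register the example tuning space (same call as above)
add_tuning_space(
  id = "classif.rpart.example",
  values = list(
    minsplit = to_tune(2, 64, logscale = TRUE),
    cp = to_tune(1e-04, 1e-1, logscale = TRUE)
  ),
  tags = c("default", "classification"),
  learner = "classif.rpart",
  label = "Classification Tree Example"
)

# retrieve the tuning space by its key
tuning_space = lts("classif.rpart.example")

# learner with the tune tokens applied, and the resulting search space
learner = tuning_space$get_learner()
learner$param_set$search_space()
```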
Choose a name that is related to the publication and adjust the documentation.
Add the reference to the bibentries.R file:
bischl_2021 = bibentry("misc",
  key = "bischl_2021",
  title = "Hyperparameter Optimization: Foundations, Algorithms, Best Practices and Open Challenges",
  author = "Bernd Bischl and Martin Binder and Michel Lang and Tobias Pielok and Jakob Richter and Stefan Coors and Janek Thomas and Theresa Ullmann and Marc Becker and Anne-Laure Boulesteix and Difan Deng and Marius Lindauer",
  year = "2021",
  eprint = "2107.05847",
  archivePrefix = "arXiv",
  primaryClass = "stat.ML",
  url = "https://arxiv.org/abs/2107.05847"
)
We are happy to help you with the pull request if you have any questions.
Binder, Martin, Florian Pfisterer, and Bernd Bischl. 2020. “Collecting Empirical Data about Hyperparameters for Data Driven AutoML.” https://www.automl.org/wp-content/uploads/2020/07/AutoML_2020_paper_63.pdf.
Bischl, Bernd, Martin Binder, Michel Lang, Tobias Pielok, Jakob Richter, Stefan Coors, Janek Thomas, et al. 2023. “Hyperparameter Optimization: Foundations, Algorithms, Best Practices and Open Challenges.” Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery. Wiley Online Library.
Kuehn, Daniel, Philipp Probst, Janek Thomas, and Bernd Bischl. 2018. “Automatic Exploration of Machine Learning Experiments on OpenML.” https://arxiv.org/abs/1806.10961.