Public API
fit_evotree
EvoTrees.fit_evotree - Function
fit_evotree(
    params::EvoTypes{L},
    dtrain;
    target_name,
    fnames=nothing,
    w_name=nothing,
    offset_name=nothing,
    deval=nothing,
    metric=nothing,
    early_stopping_rounds=9999,
    print_every_n=9999,
    verbosity=1,
    return_logger=false,
    device="cpu")
Main training function. Performs model fitting given configuration params, dtrain, target_name and other optional kwargs.
Arguments
params::EvoTypes: configuration info providing hyper-parameters, given as one of the EvoTypes model types.
dtrain: Tables-compatible training data (named tuples, DataFrame...) containing the features and target variables.
Keyword arguments
target_name: name of the target variable.
fnames = nothing: the names of the x_train features. If provided, should be a vector of strings with length(fnames) = size(x_train, 2).
w_name = nothing: name of the variable containing weights. If nothing, a common weight of one is used for all observations.
offset_name = nothing: name of the offset variable.
deval: Tables-compatible evaluation data containing the features and target variables.
metric: the evaluation metric that will be tracked on deval. Supported metrics are:
    :mse: mean-squared error. Suited to general regression models.
    :rmse: root-mean-squared error (CPU only). Suited to general regression models.
    :mae: mean absolute error. Suited to general regression models.
    :logloss: suited to :logistic regression models.
    :mlogloss: multi-class cross entropy. Suited to EvoTreeClassifier classification models.
    :poisson: Poisson deviance. Suited to EvoTreeCount count models.
    :gamma: Gamma deviance. Suited to regression problems on Gamma-like, positively distributed targets.
    :tweedie: Tweedie deviance. Suited to regression problems on Tweedie-like, positively distributed targets with probability mass at y == 0.
    :gaussian_mle: Gaussian maximum log-likelihood. Suited to EvoTreeMLE models with loss = :gaussian_mle.
    :logistic_mle: Logistic maximum log-likelihood. Suited to EvoTreeMLE models with loss = :logistic_mle.
early_stopping_rounds::Integer: number of consecutive rounds without metric improvement after which fitting is stopped.
print_every_n: sets the frequency at which logging info is printed.
verbosity: set to 1 to print logging info during training.
return_logger::Bool = false: if set to true (the default is false), fit_evotree returns a tuple (m, logger) where logger is a dict containing various tracking information.
device = "cpu": hardware device to use for computations. Can be either "cpu" or "gpu". The following losses are not GPU supported at the moment: :l1, :quantile, :logistic_mle.
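
The call pattern described above can be sketched as follows. This is a minimal illustration under stated assumptions, not a reference recipe: the synthetic data, the helper make_table and the hyper-parameter values are all hypothetical, and EvoTreeRegressor is assumed to be the regression configuration type exported by the installed EvoTrees version.

```julia
using EvoTrees
using Random

Random.seed!(123)

# Hypothetical helper: builds a NamedTuple of equal-length vectors, which is a
# Tables.jl-compatible column table with features x1, x2 and target y.
function make_table(n)
    x1, x2 = rand(n), rand(n)
    y = 2 .* x1 .- x2 .+ 0.1 .* randn(n)
    return (x1 = x1, x2 = x2, y = y)
end

dtrain = make_table(1_000)
deval = make_table(200)

# Illustrative hyper-parameters only (assumed values).
config = EvoTreeRegressor(nrounds = 200, eta = 0.1, max_depth = 4)

# With return_logger = true the call returns a (model, logger) tuple.
model, logger = fit_evotree(
    config,
    dtrain;
    target_name = "y",
    deval = deval,
    metric = :mse,
    early_stopping_rounds = 10,
    print_every_n = 20,
    return_logger = true,
)
```

Passing deval together with metric is what makes early_stopping_rounds meaningful here, since the stopping rule is described in terms of the tracked metric.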
fit_evotree(
    params::EvoTypes{L};
    x_train::AbstractMatrix,
    y_train::AbstractVector,
    w_train=nothing,
    offset_train=nothing,
    x_eval=nothing,
    y_eval=nothing,
    w_eval=nothing,
    offset_eval=nothing,
    early_stopping_rounds=9999,
    print_every_n=9999,
    verbosity=1)
Main training function. Performs model fitting given configuration params, x_train, y_train and other optional kwargs.
Arguments
params::EvoTypes: configuration info providing hyper-parameters, given as one of the EvoTypes model types.
Keyword arguments
x_train::Matrix: training data of size [#observations, #features].
y_train::Vector: vector of training targets of length #observations.
w_train::Vector: vector of training weights of length #observations. If nothing, a vector of ones is assumed.
offset_train::VecOrMat: offset for the training data. Should match the size of the predictions.
x_eval::Matrix: evaluation data of size [#observations, #features].
y_eval::Vector: vector of evaluation targets of length #observations.
w_eval::Vector: vector of evaluation weights of length #observations. Defaults to nothing (assumes a vector of ones).
offset_eval::VecOrMat: offset for the evaluation data. Should match the size of the predictions.
metric: the evaluation metric that will be tracked on x_eval, y_eval and optionally w_eval / offset_eval data. Supported metrics are:
    :mse: mean-squared error. Suited to general regression models.
    :rmse: root-mean-squared error (CPU only). Suited to general regression models.
    :mae: mean absolute error. Suited to general regression models.
    :logloss: suited to :logistic regression models.
    :mlogloss: multi-class cross entropy. Suited to EvoTreeClassifier classification models.
    :poisson: Poisson deviance. Suited to EvoTreeCount count models.
    :gamma: Gamma deviance. Suited to regression problems on Gamma-like, positively distributed targets.
    :tweedie: Tweedie deviance. Suited to regression problems on Tweedie-like, positively distributed targets with probability mass at y == 0.
    :gaussian_mle: Gaussian maximum log-likelihood. Suited to EvoTreeMLE models with loss = :gaussian_mle.
    :logistic_mle: Logistic maximum log-likelihood. Suited to EvoTreeMLE models with loss = :logistic_mle.
early_stopping_rounds::Integer: number of consecutive rounds without metric improvement after which fitting is stopped.
print_every_n: sets the frequency at which logging info is printed.
verbosity: set to 1 to print logging info during training.
fnames: the names of the x_train features. If provided, should be a vector of strings with length(fnames) = size(x_train, 2).
return_logger::Bool = false: if set to true (the default is false), fit_evotree returns a tuple (m, logger) where logger is a dict containing various tracking information.
device = "cpu": hardware device to use for computations. Can be either "cpu" or "gpu". The following losses are not GPU supported at the moment: :l1, :quantile, :logistic_mle.
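
A matrix-based call with an evaluation set might look like the sketch below. The data, the target helper and the hyper-parameter values are hypothetical, and EvoTreeRegressor is assumed to be available as the regression configuration type. The metric keyword is used as documented in the keyword list above, although it does not appear in the displayed signature.

```julia
using EvoTrees
using Random

Random.seed!(123)

# Hypothetical synthetic data: p features, noisy non-linear target.
n_train, n_eval, p = 1_000, 200, 4
x_train, x_eval = rand(n_train, p), rand(n_eval, p)
target(x) = sin.(3 .* x[:, 1]) .+ x[:, 2]
y_train = target(x_train) .+ 0.1 .* randn(n_train)
y_eval = target(x_eval) .+ 0.1 .* randn(n_eval)

# Illustrative hyper-parameters only (assumed values).
config = EvoTreeRegressor(nrounds = 200, eta = 0.1, max_depth = 4)

# Track :mse on the evaluation data and stop once it has not improved
# for 10 consecutive rounds.
model = fit_evotree(
    config;
    x_train = x_train,
    y_train = y_train,
    x_eval = x_eval,
    y_eval = y_eval,
    metric = :mse,
    early_stopping_rounds = 10,
    print_every_n = 20,
)
```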
predict
MLJModelInterface.predict - Function
predict(model::EvoTree, X::AbstractMatrix; ntree_limit = length(model.trees))
Predictions from an EvoTree model: sums the predictions from all trees composing the model. Use ntree_limit=N to only predict with the first N trees.
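
As a rough illustration of ntree_limit, the sketch below fits a small model and compares predictions from the full ensemble with predictions from its first 10 trees. The data and hyper-parameter values are hypothetical, EvoTreeRegressor is assumed to be the regression configuration type, and predict is called through the EvoTrees module on the assumption that the name is reachable there.

```julia
using EvoTrees
using Random

Random.seed!(123)

# Hypothetical synthetic regression data.
n, p = 500, 3
x_train = rand(n, p)
y_train = x_train[:, 1] .+ 0.5 .* x_train[:, 2] .+ 0.1 .* randn(n)

# Illustrative hyper-parameters only (assumed values).
config = EvoTreeRegressor(nrounds = 50, eta = 0.1, max_depth = 4)
model = fit_evotree(config; x_train = x_train, y_train = y_train, verbosity = 0)

# Default: every tree stored in model.trees contributes to the sum.
pred_full = EvoTrees.predict(model, x_train)

# Only the first 10 trees contribute.
pred_10 = EvoTrees.predict(model, x_train; ntree_limit = 10)
```

Comparing an evaluation metric across several values of ntree_limit is one way to see how much the later trees contribute without refitting.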
importance
EvoTrees.importance - Function
importance(model::EvoTree; fnames=model.info[:fnames])
Sorted normalized feature importance based on loss function gain. Feature names associated with the model are stored in model.info[:fnames] as a string Vector and can be updated at any time. E.g.: model.info[:fnames] = new_fnames_vec.
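
To make "sorted normalized importance based on gain" concrete, here is a toy sketch in plain Julia. It is not the EvoTrees implementation: the splits records and the gain_importance function are hypothetical, and the sketch only assumes that each split contributes its gain to the feature it splits on.

```julia
# Hypothetical per-split records: (feature name, loss gain of that split).
splits = [("x1", 4.0), ("x2", 1.5), ("x1", 2.0), ("x3", 0.5)]

function gain_importance(splits)
    # Accumulate the total gain attributed to each feature.
    totals = Dict{String,Float64}()
    for (fname, gain) in splits
        totals[fname] = get(totals, fname, 0.0) + gain
    end
    # Normalize so that the importances sum to one.
    total = sum(values(totals))
    imp = [fname => g / total for (fname, g) in totals]
    # Sort from most to least important feature.
    sort!(imp; by = last, rev = true)
    return imp
end

imp = gain_importance(splits)
# imp == ["x1" => 0.75, "x2" => 0.1875, "x3" => 0.0625]
```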