Releases: tidymodels/hardhat
# hardhat 1.4.0
# hardhat 1.3.1

- Changed an Rd name from `modeling-package` to `modeling-usethis` at the request of CRAN.
# hardhat 1.3.0

- New family of `spruce_*_multiple()` functions to support standardizing multi-outcome predictions (#223, with contributions from @cregouby).

- New `fct_encode_one_hot()` that encodes a factor as a one-hot indicator matrix (#215). See the sketch after this list.

- `default_recipe_blueprint()` has gained a `strings_as_factors` argument, which is passed on to `recipes::prep()` (#212).

- Using a formula blueprint with `indicators = "none"` and character predictors now works properly if you provide a character column that only contains a single value (#213).

- Using a formula blueprint with `indicators = "traditional"` or `indicators = "one_hot"` and character predictors now properly enforces the factor levels generated by those predictors on `new_data` during `forge()` (#213).

- Using a formula blueprint with `indicators = "none"` now works correctly if there is a variable in the formula with a space in the name (#217).

- `mold()` and `forge()` generally have less overhead (#235, #236).

- Added more documentation about importance and frequency weights in `?importance_weights()` and `?frequency_weights()` (#214).

- New internal `recompose()` helper (#220).
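As a quick illustration of the new encoder, here is a minimal sketch using `fct_encode_one_hot()`; the factor is invented for the example:

```r
library(hardhat)

x <- factor(c("a", "b", "a", "c"))

# One row per element of `x`, one 0/1 column per factor level
fct_encode_one_hot(x)
```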
# hardhat 1.2.0

- We have reverted the change made in hardhat 1.0.0 that caused recipe preprocessors to drop non-standard roles by default when calling `forge()`. Determining what roles are required at `bake()` time is really something that should be controlled within recipes, not hardhat. This results in the following changes (#207):

  - The new argument, `bake_dependent_roles`, that was added to `default_recipe_blueprint()` in 1.0.0 has been removed. It is no longer needed with the new behavior.

  - By default, `forge()` will pass on all columns from `new_data` to `bake()` except those with roles of `"outcome"` or `"case_weights"`. With `outcomes = TRUE`, it will also pass on the `"outcome"` role. This is essentially the same as the pre-1.0.0 behavior, and means that, by default, all non-standard roles are required at `bake()` time. This assumption is now also enforced by recipes 1.0.0, even if you aren't using hardhat or a workflow.

  - In the development version of recipes, which will become recipes 1.0.0, there is a new `update_role_requirements()` function that can be used to declare that a role is not required at `bake()` time. hardhat now knows how to respect that feature, and in `forge()` it won't pass on columns of `new_data` to `bake()` that have roles that aren't required at `bake()` time. See the sketch after this list.
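Here is a minimal sketch of the role-requirement mechanism from recipes; the recipe, the `"id"` role, and the column choices are invented for the example:

```r
library(recipes)

rec <- recipe(mpg ~ ., data = mtcars) |>
  update_role(disp, new_role = "id") |>
  # Declare that columns with the "id" role need not be present at bake() time,
  # so forge() will no longer require them in new_data
  update_role_requirements(role = "id", bake = FALSE)
```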
# hardhat 1.1.0

- Fixed a bug where the results from calling `mold()` using hardhat < 1.0.0 were no longer compatible with calling `forge()` in hardhat >= 1.0.0. This could occur if you save a workflow object after fitting it, then load it into an R session that uses a newer version of hardhat (#200).

- Internal details related to how blueprints work alongside `mold()` and `forge()` were heavily refactored to support the fix for #200. These changes are mostly internal or developer focused. They include:

  - Blueprints no longer store the clean/process functions used when calling `mold()` and `forge()`. These were stored in `blueprint$mold$clean()`, `blueprint$mold$process()`, `blueprint$forge$clean()`, and `blueprint$forge$process()`, and were strictly for internal use. Storing them in the blueprint caused problems because blueprints created with old versions of hardhat were unlikely to be compatible with newer versions of hardhat. This change means that `new_blueprint()` and the other blueprint constructors no longer have `mold` or `forge` arguments.

  - `run_mold()` has been repurposed. Rather than calling the `$clean()` and `$process()` functions (which, as mentioned above, are no longer in the blueprint), the methods for this S3 generic have been rewritten to directly call the current versions of the clean and process functions that live in hardhat. This should result in fewer accidental breaking changes.

  - New `run_forge()`, which is a `forge()` equivalent to `run_mold()`. It handles the clean/process steps that were previously handled by the `$clean()` and `$process()` functions stored directly in the blueprint. The sketch after this list shows the `mold()`/`forge()` round trip these internals support.
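For context, here is a minimal sketch of the typical `mold()`/`forge()` round trip; the formula and data are arbitrary:

```r
library(hardhat)

# Preprocess training data; the returned blueprint records how to redo this
processed <- mold(mpg ~ cyl + disp, mtcars)

# Apply the identical preprocessing to new data at prediction time
forge(head(mtcars), blueprint = processed$blueprint)
```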
# hardhat 1.0.0

- Recipe preprocessors now ignore non-standard recipe roles (i.e. not `"outcome"` or `"predictor"`) by default when calling `forge()`. Previously, it was assumed that all non-standard role columns present in the original training data were also required in the test data when `forge()` is called. More often than not, those columns are not actually required to `bake()` new data, and often they won't even be present when making predictions on new data. For example, a custom `"case_weights"` role might be required for computing case-weighted estimates at `prep()` time, but won't be necessary at `bake()` time (since the estimates have already been pre-computed and stored). To account for the case when you do require a specific non-standard role to be present at `bake()` time, `default_recipe_blueprint()` has gained a new argument, `bake_dependent_roles`, which can be set to a character vector of non-standard roles that are required.

- New `weighted_table()` for generating a weighted contingency table, similar to `table()` (#191). See the sketch after this list.

- New experimental family of functions for working with case weights. In particular, `frequency_weights()` and `importance_weights()` (#190).

- `use_modeling_files()` and `create_modeling_package()` no longer open the package documentation file in the current RStudio session (#192).

- rlang >= 1.0.2 and vctrs >= 0.4.1 are now required.

- Bumped the required R version to `>= 3.4.0` to reflect tidyverse standards.
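Here is a minimal sketch of `weighted_table()` together with the new case-weight helpers; the data is invented for the example:

```r
library(hardhat)

x <- factor(c("a", "b", "a", "b"))
y <- factor(c("c", "c", "d", "d"))
w <- c(1.5, 2, 1, 0.5)

# Like table(x, y), but each observation contributes its weight to the count
weighted_table(x = x, y = y, weights = w)

# Typed case-weight vectors: integer counts vs. arbitrary non-negative doubles
frequency_weights(c(5L, 2L))
importance_weights(c(0.5, 2.0))
```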
# hardhat 0.2.0

- Moved `tune()` from tune to hardhat (#181).

- Added `extract_parameter_dials()` and `extract_parameter_set_dials()` generics to extend the family of `extract_*()` generics.

- `mold()` no longer misinterprets `::` as an interaction term (#174). See the sketch after this list.

- When `indicators = "none"`, `mold()` no longer misinterprets factor columns as being part of an inline function if there is a similarly named non-factor column also present (#182).
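To illustrate the `::` fix, this sketch uses a namespaced inline function in a `mold()` formula; the particular formula is invented:

```r
library(hardhat)

# `stats::poly()` is resolved as a namespaced call rather than
# being misread as an interaction between `stats` and `poly`
processed <- mold(mpg ~ stats::poly(disp, degree = 2), mtcars)
names(processed$predictors)
```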
# hardhat 0.1.6

- Added a new family of `extract_*()` S3 generics for extracting important components from various tidymodels objects. S3 methods will be defined in other tidymodels packages. For example, tune will register an `extract_workflow()` method to easily extract the workflow embedded within the result of `tune::last_fit()`. A sketch of how a package might implement one of these generics follows this list.

- A logical `indicators` argument is no longer allowed in `default_formula_blueprint()`. This was soft-deprecated in hardhat 0.1.4, but will now result in an error (#144).
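As a rough sketch of how another package might provide a method for one of these generics (the `my_result` class and its `$workflow` field are hypothetical):

```r
library(hardhat)

# Hypothetical: a tuning-like result object that stores the workflow it used
extract_workflow.my_result <- function(x, ...) {
  x$workflow
}
```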
# hardhat 0.1.5

- `use_modeling_files()` (and therefore, `create_modeling_package()`) now ensures that all generated functions are templated on the model name. This makes it easier to add multiple models to the same package (#152).

- All preprocessors can now `mold()` and `forge()` predictors into one of three output formats (tibble, matrix, or `dgCMatrix` sparse matrix) via the `composition` argument of a blueprint (#100, #150). See the sketch after this list.
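A minimal sketch of the `composition` argument, requesting a matrix of predictors from the default formula blueprint; the formula and data are arbitrary:

```r
library(hardhat)

bp <- default_formula_blueprint(composition = "matrix")
processed <- mold(mpg ~ cyl + disp, mtcars, blueprint = bp)

# Predictors come back as a numeric matrix rather than a tibble
class(processed$predictors)
```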
# hardhat 0.1.4

- Setting `indicators = "none"` in `default_formula_blueprint()` no longer accidentally expands character columns into dummy variable columns. They are now left completely untouched and pass through as characters. When `indicators = "traditional"` or `indicators = "one_hot"`, character columns are treated as unordered factors (#139).

- The `indicators` argument of `default_formula_blueprint()` now takes character input rather than logical. To update:

  - `indicators = TRUE` -> `indicators = "traditional"`
  - `indicators = FALSE` -> `indicators = "none"`

  Logical input for `indicators` will continue to work, with a warning, until hardhat 0.1.6, where it will be formally deprecated. There is also a new `indicators = "one_hot"` option, which expands all factor columns into `K` dummy variable columns corresponding to the `K` levels of that factor, rather than the more traditional `K - 1` expansion. The sketch after this list contrasts the two expansions.
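To contrast the two expansions, here is a minimal sketch; the data frame is invented for the example:

```r
library(hardhat)

df <- data.frame(y = 1:4, x = factor(c("a", "b", "c", "a")))

# Traditional expansion: generally K - 1 dummy columns per factor
bp_trad <- default_formula_blueprint(indicators = "traditional")
colnames(mold(y ~ x, df, blueprint = bp_trad)$predictors)

# One-hot expansion: K dummy columns, one per factor level
bp_onehot <- default_formula_blueprint(indicators = "one_hot")
colnames(mold(y ~ x, df, blueprint = bp_onehot)$predictors)
```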