As of September 2021:
* Minor change to how binning is performed when `num_knots = 1`, ensuring that the minimal number of knots is chosen in that case. This results in HAL agreeing with (main terms) `glmnet` when `smoothness_orders = 1` and `num_knots = 1`.
* Revised formula interface with enhanced capabilities, allowing specification of penalization factors, `smoothness_orders`, and the number of knots for each variable, separately for every single term, using the new `h` function. It is possible to specify, e.g., `h(X) + h(W)`, which will generate and concatenate the two basis function terms.
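
The term-wise formula interface described above might be used roughly as follows. This is a hedged sketch, not an excerpt from the package documentation: the simulated data and the column names `x1` and `x2` are made up for illustration, and the one-sided formula and the `new_data` argument of `predict` reflect one reading of the interface that should be checked against the installed version. The call is guarded so the snippet does nothing beyond simulating data if `hal9001` is not available.

```r
# Illustrative data (hypothetical): two named covariates and a noisy outcome.
set.seed(1)
n <- 100
X <- cbind(x1 = runif(n), x2 = runif(n))
Y <- sin(2 * pi * X[, "x1"]) + X[, "x2"] + rnorm(n, sd = 0.1)

# Fit only when the package is installed; otherwise skip quietly.
if (requireNamespace("hal9001", quietly = TRUE)) {
  library(hal9001)
  # One h() term per covariate: the two sets of basis functions are
  # generated separately and concatenated, as described in the entry above.
  fit <- fit_hal(X = X, Y = Y, formula = ~ h(x1) + h(x2),
                 smoothness_orders = 1)
  preds <- predict(fit, new_data = X)
}
```

Interaction terms or term-specific settings would be passed through additional arguments to `h`; those are omitted here because their exact names are not stated in this changelog.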

As of April 2021:
* The default of `fit_hal` is now a first-order smoothed HAL with binning.
* Updated documentation for `formula_hal`, `fit_hal`, and `predict`; added `fit_control` and `formula_control` lists for arguments. Moved much of the text to details sections and shortened the argument descriptions.
* Updated `summary` to support higher-order HAL fit interpretations.
* Added checks to `fit_hal` for missingness and dimensionality correspondence between `X`, `Y`, and `X_unpenalized`. These checks produce errors quickly, as opposed to enumerating the basis list and then letting `glmnet` error on something trivial like this.
* Modified formula interface in `fit_hal`, so `formula` is now provided directly to `fit_hal` and `formula_hal` is run within `fit_hal`. Due to these changes, it no longer made sense for `formula_hal` to accept `data`, so it now takes as input `X`. Also, the `formula_fit_hal` function was removed as it is no longer needed.
* Support for the custom lasso procedure implemented in `Rcpp` has been discontinued. Accordingly, the `"lassi"` option and argument `fit_type` have been removed from `fit_hal`.
* Re-added `lambda.min.ratio` as a `fit_control` argument to `fit_hal`. We’ve seen that not setting `lambda.min.ratio` in `glmnet` can lead to no `lambda` values that fit the data sufficiently well, so it seems appropriate to override the `glmnet` default.
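
The fail-fast input checks mentioned in this release (missingness and dimensionality correspondence between `X`, `Y`, and `X_unpenalized`) can be pictured with a small base R sketch. The helper `check_hal_inputs` is hypothetical and is not the package's internal code; it only illustrates why validating inputs before building the basis list gives quicker, clearer errors.

```r
# Hypothetical helper (not part of hal9001): validate inputs up front so
# that problems surface before any expensive basis enumeration.
check_hal_inputs <- function(X, Y, X_unpenalized = NULL) {
  X <- as.matrix(X)
  if (anyNA(X) || anyNA(Y)) {
    stop("X and Y must not contain missing values.")
  }
  if (nrow(X) != length(Y)) {
    stop("The number of rows of X must match the length of Y.")
  }
  if (!is.null(X_unpenalized)) {
    X_unpenalized <- as.matrix(X_unpenalized)
    if (anyNA(X_unpenalized)) {
      stop("X_unpenalized must not contain missing values.")
    }
    if (nrow(X_unpenalized) != nrow(X)) {
      stop("X_unpenalized must have the same number of rows as X.")
    }
  }
  invisible(TRUE)
}

# Usage: a valid input passes; a length mismatch errors immediately.
ok <- check_hal_inputs(matrix(runif(20), ncol = 2), rnorm(10))
bad <- try(check_hal_inputs(matrix(runif(20), ncol = 2), rnorm(9)),
           silent = TRUE)
```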

As of February 2021:
* Support higher order HAL via the new `smoothness_orders` argument.
  * `smoothness_orders` is a vector of length 1 or length `ncol(X)`.
  * If `smoothness_orders` is of length 1, then its value is recycled to form a vector of length `ncol(X)`.
  * Given such a vector of length `ncol(X)`, the ith element gives the level of smoothness for the variable corresponding to the ith column in `X`.
* Degree-dependent binning. Higher order terms are binned more coarsely; the `num_knots` argument is a vector of length up to `max_degree` controlling the degree-specific binning.
* Adds `formula_hal`, which allows a formula specification of a HAL model.
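
The recycling rule for `smoothness_orders` described above can be sketched in a few lines of base R. The function name `expand_smoothness_orders` is hypothetical and this is not the package's internal code; it is only meant to make the length-1 versus length-`ncol(X)` behavior concrete.

```r
# Hypothetical helper (not part of hal9001): expand smoothness_orders to
# one entry per column of X, following the rule stated in the changelog.
expand_smoothness_orders <- function(smoothness_orders, X) {
  p <- ncol(X)
  if (length(smoothness_orders) == 1) {
    # A single value is recycled across all columns.
    rep(smoothness_orders, p)
  } else if (length(smoothness_orders) == p) {
    # Already one value per column; the ith entry applies to column i.
    smoothness_orders
  } else {
    stop("smoothness_orders must have length 1 or ncol(X).")
  }
}

X_demo <- matrix(rnorm(30), ncol = 3)
recycled <- expand_smoothness_orders(1, X_demo)           # one order for all columns
per_column <- expand_smoothness_orders(c(0, 1, 2), X_demo) # column-specific orders
```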

As of November 2020:
* Adds support for passing the Poisson family to `glmnet()`.
* Begins consideration of supporting arbitrary `stats::family()` objects to be passed through to calls to `glmnet()`.
* Simplifies output of `fit_hal()` by unifying the redundant `hal_lasso` and `glmnet_lasso` slots into the new `lasso_fit` slot.
* Cleans up methods throughout and improves documentation, reducing a few redundancies for cleaner/simpler code in `summary.hal9001`.
* Adds link to DOI of the published Journal of Open Source Software paper in `DESCRIPTION`.

As of September 2020:
* Adds a `summary` method for interpreting HAL regressions (https://github.com/tlverse/hal9001/pull/64).
* Adds a software paper for publication in the Journal of Open Source Software (https://github.com/tlverse/hal9001/pull/71).

As of June 2020:
* Addresses bugs/inconsistencies reported in the prediction method when trying to specify a value of `lambda` not included in initial fitting.
* Addresses a bug arising from a silent failure in `glmnet` in which it ignores the argument `lambda.min.ratio` when `family = "gaussian"` is not set.
* Adds a short software paper for submission to JOSS.
* Minor documentation updates.

As of March 2020:
* First CRAN release.