hal9001
The Scalable Highly Adaptive Lasso
Authors: Jeremy Coyle, Nima Hejazi, Rachael Phillips, Lars van der Laan, and Mark van der Laan
hal9001 is an R package providing an implementation of
the scalable highly adaptive lasso (HAL), a nonparametric
regression estimator that applies L1-regularized lasso regression to a
design matrix composed of indicator functions corresponding to the
support of the functional over a set of covariates and interactions
thereof. HAL regression allows for arbitrarily complex functional forms
to be estimated at fast (near-parametric) convergence rates under only
global smoothness assumptions (van der Laan 2017a; Bibaut and van der
Laan 2019). For detailed theoretical discussions of the highly adaptive
lasso estimator, see, for example, van der Laan (2017a),
van der Laan (2017b), and van der Laan and Bibaut (2017). For a
computational demonstration of the versatility of HAL regression, see
Benkeser and van der Laan (2016). Recent theoretical works have
demonstrated success in building efficient estimators of complex
parameters when particular variations of HAL regression are used to
estimate nuisance parameters (e.g., van der Laan, Benkeser, and Cai
2019; Ertefaie, Hejazi, and van der Laan 2020).
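The description above can be made concrete with a small sketch. The code below builds a zero-order indicator basis by hand, with one knot per observed value, for two covariates and their interaction, and then optionally fits a lasso to it. It is only an illustration of the idea under stated assumptions, not how hal9001 constructs or stores its basis internally. The helper indicator_basis() and the simulated data are hypothetical, and the lasso step assumes the glmnet package is installed.

```r
# Illustrative sketch only: a hand-rolled HAL-style basis, not hal9001 internals.
set.seed(1)
n <- 50
x <- matrix(rnorm(n * 2), n, 2)
y <- x[, 1] * sin(x[, 2]) + rnorm(n, sd = 0.2)

# Hypothetical helper: column j is the indicator I(v >= knot_j),
# with knots placed at the observed values of the covariate.
indicator_basis <- function(v, knots = v) {
  outer(v, knots, FUN = ">=") * 1
}

h1 <- indicator_basis(x[, 1])  # main-term block for covariate 1 (n x n)
h2 <- indicator_basis(x[, 2])  # main-term block for covariate 2 (n x n)
# Interaction block: column j is I(x1 >= x1_j) * I(x2 >= x2_j)
h12 <- h1 * h2
design <- cbind(h1, h2, h12)

# L1-regularized regression on the indicator design (assumes glmnet is available)
if (requireNamespace("glmnet", quietly = TRUE)) {
  lasso_fit <- glmnet::cv.glmnet(design, y)
  lasso_preds <- predict(lasso_fit, newx = design, s = "lambda.min")
  print(mean((lasso_preds - y)^2))
}
```

The design matrix here has 3n columns for only two covariates, which hints at why a scalable implementation matters as the number of covariates and the interaction degree grow.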
For standard use, we recommend installing the package from CRAN via
install.packages("hal9001")
To contribute, install the development version of hal9001 from GitHub via remotes:
remotes::install_github("tlverse/hal9001")
If you encounter any bugs or have any specific feature requests, please file an issue.
Consider the following minimal example of using hal9001 to generate
predictions via Highly Adaptive Lasso regression:
# load the package and set a seed
library(hal9001)
#> Loading required package: Rcpp
#> hal9001 v0.4.3: The Scalable Highly Adaptive Lasso
#> note: fit_hal defaults have changed. See ?fit_hal for details
set.seed(385971)

# simulate data
n <- 100
p <- 3
x <- matrix(rnorm(n * p), n, p)
y <- x[, 1] * sin(x[, 2]) + rnorm(n, mean = 0, sd = 0.2)

# fit the HAL regression
hal_fit <- fit_hal(X = x, Y = y, yolo = TRUE)
#> [1] "I'm sorry, Dave. I'm afraid I can't do that."
hal_fit$times
#>                   user.self sys.self elapsed user.child sys.child
#> enumerate_basis       0.008     0.00   0.008          0         0
#> design_matrix         0.003     0.00   0.003          0         0
#> reduce_basis          0.000     0.00   0.000          0         0
#> remove_duplicates     0.000     0.00   0.000          0         0
#> lasso                 3.012     0.01   3.023          0         0
#> total                 3.024     0.01   3.035          0         0

# training sample prediction
preds <- predict(hal_fit, new_data = x)
mean(hal_mse <- (preds - y)^2)
#> [1] 0.03754093
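The example above reports the mean squared error on the training sample, which tends to be optimistic. As a hedged follow-up, the sketch below draws a separate hold-out sample from the same simulation design and evaluates predictions on it. The helpers simulate_data() and mse() are hypothetical names introduced for illustration. The only hal9001 calls used are fit_hal(X = , Y = ) and predict(..., new_data = ), as in the example, and they are guarded so the script still runs if the package is not installed.

```r
# Hypothetical follow-up: hold-out evaluation under the same simulation design.
set.seed(385971)
n <- 100
p <- 3

# Illustrative helper: draw covariates and outcome as in the example above
simulate_data <- function(n, p) {
  x <- matrix(rnorm(n * p), n, p)
  y <- x[, 1] * sin(x[, 2]) + rnorm(n, mean = 0, sd = 0.2)
  list(x = x, y = y)
}

# Illustrative helper: mean squared error
mse <- function(pred, obs) mean((pred - obs)^2)

train <- simulate_data(n, p)
holdout <- simulate_data(n, p)

if (requireNamespace("hal9001", quietly = TRUE)) {
  hal_fit <- hal9001::fit_hal(X = train$x, Y = train$y)
  holdout_preds <- predict(hal_fit, new_data = holdout$x)
  print(mse(holdout_preds, holdout$y))
}
```

Because the noise standard deviation in this simulation is 0.2, a hold-out MSE cannot be expected to fall much below 0.04 on average, which gives a rough yardstick for the result.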
Contributions are very welcome. Interested contributors should consult our contribution guidelines prior to submitting a pull request.
After using the hal9001 R package, please cite both of the following:
@software{coyle2022hal9001-rpkg,
author = {Coyle, Jeremy R and Hejazi, Nima S and Phillips, Rachael V
and {van der Laan}, Lars and {van der Laan}, Mark J},
title = {{hal9001}: The scalable highly adaptive lasso},
year = {2022},
url = {https://doi.org/10.5281/zenodo.3558313},
doi = {10.5281/zenodo.3558313},
note = {{R} package version 0.4.2}
}
@article{hejazi2020hal9001-joss,
author = {Hejazi, Nima S and Coyle, Jeremy R and {van der Laan}, Mark
J},
title = {{hal9001}: Scalable highly adaptive lasso regression in
{R}},
year = {2020},
url = {https://doi.org/10.21105/joss.02526},
doi = {10.21105/joss.02526},
journal = {Journal of Open Source Software},
publisher = {The Open Journal}
}
© 2017-2022 Jeremy R. Coyle & Nima S. Hejazi
The contents of this repository are distributed under the GPL-3
license. See file LICENSE for details.
Benkeser, David, and Mark J van der Laan. 2016. “The Highly Adaptive Lasso Estimator.” In 2016 IEEE International Conference on Data Science and Advanced Analytics (DSAA). IEEE. https://doi.org/10.1109/dsaa.2016.93.
Bibaut, Aurélien F, and Mark J van der Laan. 2019. “Fast Rates for Empirical Risk Minimization over Càdlàg Functions with Bounded Sectional Variation Norm.” https://arxiv.org/abs/1907.09244.
Ertefaie, Ashkan, Nima S Hejazi, and Mark J van der Laan. 2020. “Nonparametric Inverse Probability Weighted Estimators Based on the Highly Adaptive Lasso.” https://arxiv.org/abs/2005.11303.
van der Laan, Mark J. 2017a. “A Generally Efficient Targeted Minimum Loss Based Estimator Based on the Highly Adaptive Lasso.” The International Journal of Biostatistics. https://doi.org/10.1515/ijb-2015-0097.
———. 2017b. “Finite Sample Inference for Targeted Learning.” https://arxiv.org/abs/1708.09502.
van der Laan, Mark J, David Benkeser, and Weixin Cai. 2019. “Efficient Estimation of Pathwise Differentiable Target Parameters with the Undersmoothed Highly Adaptive Lasso.” https://arxiv.org/abs/1908.05607.
van der Laan, Mark J, and Aurélien F Bibaut. 2017. “Uniform Consistency of the Highly Adaptive Lasso Estimator of Infinite-Dimensional Parameters.” https://arxiv.org/abs/1709.06256.