missoNet is an R package that fits penalized multi-task
regression – that is, with multiple correlated tasks or response
variables – to simultaneously estimate the coefficients of a set of
predictor variables for all tasks and the conditional response network
structure given all predictors, via penalized maximum likelihood in an
undirected conditional Gaussian graphical model. In contrast to most
penalized multi-task regression (conditional graphical lasso) methods,
missoNet can obtain estimates even when the response data are corrupted
by missing values. The method retains the theoretical and computational
benefits of convexity, and returns solutions that are close to the
estimates that would be obtained if no values were missing.
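To make the model concrete, here is a minimal base-R sketch of the data-generating process that the description above implies: the responses are a linear function of the predictors plus Gaussian noise whose precision matrix encodes the conditional response network, and some response entries are then masked completely at random. The sketch does not use missoNet; the names `B`, `Theta` and `Z`, the dimensions, and the tridiagonal precision matrix are all illustrative assumptions.

```r
set.seed(1)
n <- 200; p <- 10; q <- 5   # illustrative sizes (assumed, not package defaults)

# Sparse coefficient matrix B (p x q): roughly 20% non-zero entries.
B <- matrix(rnorm(p * q) * rbinom(p * q, 1, 0.2), p, q)

# Tridiagonal precision matrix Theta (q x q); strictly diagonally dominant,
# hence positive definite.
Theta <- diag(q)
Theta[abs(row(Theta) - col(Theta)) == 1] <- 0.4

# Noise whose rows are N(0, Theta^{-1}): chol() returns R with t(R) %*% R = Sigma.
Sigma <- solve(Theta)
E <- matrix(rnorm(n * q), n, q) %*% chol(Sigma)

X <- matrix(rnorm(n * p), n, p)
Y <- X %*% B + E            # complete responses

# Mask about 10% of the response entries completely at random (MCAR).
Z <- Y
Z[matrix(runif(n * q) < 0.1, n, q)] <- NA

mean(is.na(Z))              # observed missing rate, close to 0.1
```

In this notation, the zero pattern of `B` says which predictors affect which responses, and the zero pattern of `Theta` says which pairs of responses are conditionally independent given the predictors.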
The package provides an integrated set of core routines including 1)
data simulation; 2) model fitting and cross-validation; 3) visualization
of results; 4) predictions in new data. The function arguments are in
the same style as those of glmnet, making it easy for
experienced users to get started.
To install the package missoNet from CRAN, type the
following command in the R console:
install.packages("missoNet")
Or install the development version of missoNet from GitHub:
if(!require("devtools")) {
  install.packages("devtools")
}
devtools::install_github("yixiao-zeng/missoNet", build_vignettes = TRUE)
An example of how to use the package:
library(missoNet)

# Simulate a dataset with response values missing completely at random (MCAR);
# the overall missing rate is around 10%.
sim.dat <- generateData(n = 300, p = 50, q = 20, rho = 0.1, missing.type = "MCAR")
tr <- 1:240  # training set indices
tst <- 241:300  # test set indices
X.tr <- sim.dat$X[tr, ]  # predictor matrix
Y.tr <- sim.dat$Z[tr, ]  # corrupted response matrix
# Perform a five-fold cross-validation on the training set.
cvfit <- cv.missoNet(X = X.tr, Y = Y.tr, kfold = 5)
# Alternatively, compute the cross-validation folds in parallel.
cl <- parallel::makeCluster(min(parallel::detectCores()-1, 3))
cvfit <- cv.missoNet(X = X.tr, Y = Y.tr, kfold = 5,
                     parallel = TRUE, cl = cl)
parallel::stopCluster(cl)
# Plot the standardized mean cross-validated errors in a heatmap.
plot(cvfit)
# Extract the estimates at "lambda.min" that gives the minimum cross-validated error.
Beta_hat <- cvfit$est.min$Beta
Theta_hat <- cvfit$est.min$Theta
# Make predictions of response values on the test set.
newy <- predict(cvfit, newx = sim.dat$X[tst, ], s = "lambda.min")
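Because held-out responses may themselves contain missing values, a prediction error has to be computed over the observed entries only. A minimal base-R sketch of one way to do this follows; `mse_observed` is a hypothetical helper, not part of missoNet, and the small matrices are toy stand-ins for the test responses and their predictions.

```r
# Hypothetical helper (not part of missoNet): mean squared prediction error
# computed over the observed (non-NA) entries of a response matrix.
mse_observed <- function(y_obs, y_pred) {
  stopifnot(all(dim(y_obs) == dim(y_pred)))
  observed <- !is.na(y_obs)
  mean((y_obs[observed] - y_pred[observed])^2)
}

# Toy stand-ins for the test responses and their predictions.
y_obs  <- matrix(c(1, NA, 3, 4), nrow = 2)
y_pred <- matrix(c(1.5, 2, 3, 3), nrow = 2)
mse_observed(y_obs, y_pred)
```

With the objects from the example above, the call would look like `mse_observed(sim.dat$Z[tst, ], newy)`, assuming `newy` is a matrix with the same dimensions as the test response matrix.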
See the vignette for more detailed information.
vignette("missoNet")