This page describes the boot_MI method, which is implemented in the
psfmi_validate function of the psfmi package. The method follows the
internal validation procedure of the validate function in the rms
package for complete data, but applies it to multiply imputed data. With
boot_MI, bootstrap samples are first drawn from the original incomplete
dataset, and multiple imputation is then applied in each of these
incomplete bootstrap samples. The pooled model is analyzed in each
bootstrap sample (training data) and subsequently tested in the original
multiply imputed data to determine the amount of optimism. The method
can be combined with backward or forward selection. The figure below
visualizes these steps.
Schematic overview of the boot_MI method
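The first step described above, drawing a bootstrap sample from the incomplete data before any imputation, can be sketched in base R. The toy data frame and its variable names are hypothetical stand-ins for an incomplete dataset such as lbp_orig. This illustrates only the resampling step, not the package's internal code.

```r
# Toy incomplete dataset (hypothetical; it only stands in for real data)
set.seed(1)
n <- 20
toy <- data.frame(
  id = seq_len(n),
  y  = rbinom(n, 1, 0.5),
  x  = rnorm(n)
)
toy$x[sample(n, 5)] <- NA  # make 5 values of x missing

# Resample rows with replacement from the incomplete data
boot_rows   <- sample(n, size = n, replace = TRUE)
boot_sample <- toy[boot_rows, ]

# The bootstrap sample has the same size as the original data and can still
# contain missing values; imputation would be applied to it afterwards.
nrow(boot_sample)
sum(is.na(boot_sample$x))
```

Because the resampling happens before imputation, each bootstrap sample gets its own imputation step, which is what distinguishes this ordering from bootstrapping already imputed data.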
Internal validation is performed on the last model selected by the
function psfmi_lr. In the example below, psfmi_lr is used with p.crit
set to 1, and the same setting is used in the psfmi_validate function.
The full model is therefore pooled first, and internal validation is
then performed on this full model, without variable selection.
library(psfmi)

# Pool the full model over the imputed datasets (p.crit = 1: no selection)
pool_lr <- psfmi_lr(data=lbpmilr, formula = Chronic ~ Pain + JobDemands + rcs(Tampascale, 3) +
                    factor(Satisfaction) + Smoking, p.crit = 1, direction="FW",
                    nimp=5, impvar="Impnr", method="D1")

set.seed(100)
# Validate the full model with boot_MI: 5 bootstrap samples, 3 imputations each
res_MI_boot <- psfmi_validate(pool_lr, val_method = "boot_MI", data_orig = lbp_orig, nboot = 5,
                              p.crit=1, nimp_mice = 3, direction = "BW", miceImp = miceImp,
                              printFlag = FALSE)
##
## Boot 1
##
## Boot 2
##
## Boot 3
##
## Boot 4
##
## Boot 5
##
## p.crit = 1, validation is done without variable selection
res_MI_boot
## $stats_val
## Orig Apparent Test Optimism Corrected
## AUC 0.8871000 0.8902200 0.8799400 0.01028000 0.8768200
## R2 0.5605521 0.5753886 0.5370366 0.03835201 0.5222001
## Brier Scaled 0.4514569 0.4642118 0.4170494 0.04716233 0.4042946
## Slope 1.0000000 1.0000000 0.9095908 0.09040916 0.9095908
##
## $intercept_test
## intercept
## -0.1439701
##
## $res_boot
## ROC_app ROC_test R2_app R2_test Brier_sc_app Brier_sc_test
## Boot 1 0.8431 0.8777 0.4443306 0.5289039 0.3356881 0.4126126
## Boot 2 0.9112 0.8821 0.6287289 0.5477200 0.5180882 0.4264771
## Boot 3 0.8850 0.8825 0.5777050 0.5472948 0.4666613 0.4398712
## Boot 4 0.9073 0.8854 0.6189531 0.5447476 0.5028865 0.4132493
## Boot 5 0.9045 0.8720 0.6072256 0.5165169 0.4977346 0.3930369
## intercept Slope
## Boot 1 -0.11704446 1.2052458
## Boot 2 -0.17911157 0.7351399
## Boot 3 -0.05864258 0.9260111
## Boot 4 -0.28209209 0.8913937
## Boot 5 -0.08295958 0.7901638
##
## $predictors_selected
## Pain JobDemands Smoking factor(Satisfaction) rcs(Tampascale,3)
## Boot 1 1 1 1 1 1
## Boot 2 1 1 1 1 1
## Boot 3 1 1 1 1 1
## Boot 4 1 1 1 1 1
## Boot 5 1 1 1 1 1
##
## $model_orig
## Chronic ~ Pain + JobDemands + Smoking + factor(Satisfaction) +
## rcs(Tampascale, 3)
## <environment: 0x000001db9ebdce70>
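The columns of $stats_val in the output above are consistent with simple arithmetic on the per-bootstrap values in $res_boot. The sketch below reproduces the AUC row from the printed numbers. The variable names are illustrative, and the calculation is inferred from the output rather than taken from the package source.

```r
# Per-bootstrap AUC values copied from $res_boot in the output above
auc_app  <- c(0.8431, 0.9112, 0.8850, 0.9073, 0.9045)  # ROC_app
auc_test <- c(0.8777, 0.8821, 0.8825, 0.8854, 0.8720)  # ROC_test
auc_orig <- 0.8871                                     # Orig column of $stats_val

apparent  <- mean(auc_app)        # Apparent: mean performance in the bootstrap samples
test      <- mean(auc_test)       # Test: mean performance in the original imputed data
optimism  <- apparent - test      # Optimism: how much the apparent value overstates
corrected <- auc_orig - optimism  # Corrected: original estimate minus optimism

round(c(Apparent = apparent, Test = test,
        Optimism = optimism, Corrected = corrected), 5)
```

The rounded results (0.89022, 0.87994, 0.01028 and 0.87682) match the AUC row of $stats_val. The same arithmetic applied to the Slope column, with an apparent slope of 1, gives the mean test slope of about 0.9096 as the corrected value.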
Internal validation is performed on the last model selected by the
function psfmi_lr. In the example below, psfmi_lr is used with p.crit
set to 1, so the full model is pooled. Internal validation is then done
with the psfmi_validate function, including BW selection, by setting
p.crit=0.05. BW selection is then applied in each bootstrap sample,
starting from the full model of pool_lr. In this way, a shrinkage factor
can be determined that accounts for backward selection of variables.
This gives a fair shrinkage factor, because variable selection is
responsible for a large amount of overfitting in coefficients.
library(psfmi)

# Pool the full model over the imputed datasets (p.crit = 1: no selection)
pool_lr <- psfmi_lr(data=lbpmilr, Outcome="Chronic", predictors = c("Pain", "JobDemands", "Smoking"),
                    cat.predictors = "Satisfaction", spline.predictors = "Tampascale", nknots=3,
                    p.crit = 1, direction="FW", nimp=5, impvar="Impnr", method="D1")

set.seed(100)
# Validate with boot_MI, applying BW selection (p.crit = 0.05) in each bootstrap sample
res_MI_boot <- psfmi_validate(pool_lr, val_method = "boot_MI", data_orig = lbp_orig, nboot = 5,
                              p.crit=0.05, nimp_mice = 3, direction = "BW", miceImp = miceImp,
                              printFlag = FALSE)
##
## Boot 1
## Removed at Step 1 is - JobDemands
## Removed at Step 2 is - Smoking
## Removed at Step 3 is - rcs(Tampascale,3)
##
## Selection correctly terminated,
## No more variables removed from the model
##
## Boot 2
## Removed at Step 1 is - JobDemands
## Removed at Step 2 is - Smoking
## Removed at Step 3 is - rcs(Tampascale,3)
##
## Selection correctly terminated,
## No more variables removed from the model
##
## Boot 3
## Removed at Step 1 is - JobDemands
## Removed at Step 2 is - Smoking
## Removed at Step 3 is - factor(Satisfaction)
## Removed at Step 4 is - rcs(Tampascale,3)
##
## Selection correctly terminated,
## No more variables removed from the model
##
## Boot 4
## Removed at Step 1 is - Smoking
## Removed at Step 2 is - JobDemands
## Removed at Step 3 is - rcs(Tampascale,3)
##
## Selection correctly terminated,
## No more variables removed from the model
##
## Boot 5
## Removed at Step 1 is - JobDemands
## Removed at Step 2 is - Smoking
##
## Selection correctly terminated,
## No more variables removed from the model
## Removed at Step 1 is - Smoking
## Removed at Step 2 is - JobDemands
## Removed at Step 3 is - rcs(Tampascale,3)
##
## Selection correctly terminated,
## No more variables removed from the model
res_MI_boot
## $stats_val
## Orig Apparent Test Optimism Corrected
## AUC 0.8730000 0.8729200 0.8666600 0.00626000 0.8667400
## R2 0.5244014 0.5324578 0.5097414 0.02271640 0.5016850
## Brier Scaled 0.4384749 0.4319009 0.4094039 0.02249699 0.4159780
## Slope 1.0000000 1.0000000 0.9556648 0.04433523 0.9556648
##
## $intercept_test
## intercept
## -0.1370437
##
## $res_boot
## ROC_app ROC_test R2_app R2_test Brier_sc_app Brier_sc_test
## Boot 1 0.8230 0.8669 0.4070467 0.5121114 0.3136382 0.4027844
## Boot 2 0.8987 0.8730 0.6000170 0.5236933 0.4855851 0.4341447
## Boot 3 0.8566 0.8450 0.4994931 0.4501806 0.4122274 0.3647432
## Boot 4 0.8953 0.8669 0.5832190 0.5200335 0.4883561 0.4224800
## Boot 5 0.8910 0.8815 0.5725132 0.5426883 0.4596977 0.4228672
## intercept Slope
## Boot 1 -0.14270301 1.2315078
## Boot 2 -0.02045524 0.7982327
## Boot 3 -0.27553822 0.8638402
## Boot 4 -0.17098555 0.9449117
## Boot 5 -0.07553651 0.9398314
##
## $predictors_selected
## Pain JobDemands Smoking factor(Satisfaction) rcs(Tampascale,3)
## Boot 1 1 0 0 1 0
## Boot 2 1 0 0 1 0
## Boot 3 1 0 0 0 0
## Boot 4 1 0 0 1 0
## Boot 5 1 0 0 1 1
##
## $model_orig
## Chronic ~ Pain + factor(Satisfaction)
## <environment: 0x000001dba2c30760>
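The $predictors_selected table in the output above can be summarized as a selection frequency per predictor, which shows how stable the backward selection was across bootstrap samples. The sketch below rebuilds the table from the printed values so that it runs on its own. With the real object, something like colMeans(res_MI_boot$predictors_selected) would presumably give the same result, assuming that element is a numeric matrix or data frame.

```r
# Selection indicators copied from $predictors_selected in the output above
sel <- matrix(
  c(1, 0, 0, 1, 0,
    1, 0, 0, 1, 0,
    1, 0, 0, 0, 0,
    1, 0, 0, 1, 0,
    1, 0, 0, 1, 1),
  nrow = 5, byrow = TRUE,
  dimnames = list(
    paste("Boot", 1:5),
    c("Pain", "JobDemands", "Smoking",
      "factor(Satisfaction)", "rcs(Tampascale,3)")
  )
)

# Proportion of bootstrap samples in which each predictor was retained
sel_freq <- colMeans(sel)
sel_freq
```

Pain is retained in all five bootstrap samples, factor(Satisfaction) in four and rcs(Tampascale,3) in one, while JobDemands and Smoking are never retained. With only five bootstrap samples these frequencies are rough; a larger nboot would give a more reliable picture.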