StepReg — Stepwise Regression Analysis
Version: v1.4.4
What is stepwise regression?
Stepwise regression is a method of fitting regression models in which the choice of predictive variables is carried out by an automatic procedure. Three types of stepwise regression are available: stepwise linear regression, stepwise logistic regression and stepwise Cox regression.
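To make the "automatic procedure" concrete, here is a minimal sketch of forward selection by AIC written in base R only. It is an illustration of the general idea, not StepReg's implementation; the helper `fit_with` and the variable names are hypothetical, and base R's `AIC()` may differ from the package's criterion by a constant.

```r
# Forward selection by AIC on mtcars, using base R only (illustrative sketch).
cars <- datasets::mtcars
candidates <- setdiff(names(cars), "mpg")

# Hypothetical helper: fit mpg on the given predictors (intercept-only if none).
fit_with <- function(vars) {
  rhs <- if (length(vars) == 0) "1" else vars
  lm(reformulate(rhs, response = "mpg"), data = cars)
}

selected <- character(0)
repeat {
  remaining <- setdiff(candidates, selected)
  if (length(remaining) == 0) break
  # AIC of every model that adds exactly one remaining candidate.
  aics <- sapply(remaining, function(v) AIC(fit_with(c(selected, v))))
  # Stop when no single addition lowers the AIC.
  if (min(aics) >= AIC(fit_with(selected))) break
  selected <- c(selected, names(which.min(aics)))
}
selected
```

A bidirectional search would additionally try dropping each selected variable after every addition; that step is omitted here for brevity.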
The main approaches:
Selection Criterion:
AIC/AICc/BIC/Cp/HQ/HQc/Rsq/adjRsq/SBC/SL (p-value), with SL based on the F test and approximate F test for linear regression, and on the score test and Wald test for logistic regression.
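The SL criterion can be illustrated with base R alone. The sketch below computes an F-test p-value for adding one variable to a linear model and a Rao score-test p-value for adding one variable to a logistic model, then compares each with an entry threshold. It assumes that `sle` acts as a significance level for entry, as in SAS; the variable names are illustrative, and the calls are base R, not StepReg internals.

```r
# Entry tests for a single candidate variable, using base R only.
cars <- datasets::mtcars
sle <- 0.15  # assumed meaning: significance level a variable must beat to enter

# Linear regression: partial F test for adding `hp` to mpg ~ wt.
reduced <- lm(mpg ~ wt, data = cars)
full <- lm(mpg ~ wt + hp, data = cars)
p_f <- anova(reduced, full)[["Pr(>F)"]][2]
enters_linear <- p_f < sle

# Logistic regression: Rao score test for adding `mpg` to vs ~ 1.
null_fit <- glm(vs ~ 1, family = binomial, data = cars)
mpg_fit <- glm(vs ~ mpg, family = binomial, data = cars)
p_rao <- anova(null_fit, mpg_fit, test = "Rao")[["Pr(>Chi)"]][2]
enters_logit <- p_rao < sle

# Information criteria for the larger linear model, as defined by base R.
c(AIC = AIC(full), BIC = BIC(full), adjRsq = summary(full)$adj.r.squared)
```

Information criteria such as AIC need no threshold: the candidate model with the lowest value is preferred.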
Multicollinearity:
QR matrix decomposition is performed before stepwise regression to detect and remove variables causing multicollinearity.
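As a rough illustration of how a QR decomposition exposes linearly dependent columns, the sketch below uses base R's `qr()` on the mtcars predictors after adding `yes`, an exact copy of `wt`. This is an assumption about the general technique, not a copy of the package's internal code; `X`, `keep` and `X_clean` are illustrative names.

```r
# Detect exactly collinear predictors with a pivoted QR decomposition.
cars <- datasets::mtcars
cars$yes <- cars$wt  # `yes` duplicates `wt`

# Predictor matrix without an intercept column; the response `mpg` is excluded.
X <- as.matrix(cars[, setdiff(names(cars), "mpg")])

# qr() moves columns it judges linearly dependent to the end of the pivot
# vector, so the first `rank` pivots index a set of independent columns.
qrX <- qr(X)
keep <- qrX$pivot[seq_len(qrX$rank)]
dependent <- colnames(X)[setdiff(seq_len(ncol(X)), keep)]

# Matrix with the dependent columns removed.
X_clean <- X[, keep, drop = FALSE]
dependent
```

Which of two identical columns is flagged depends on column order: the later one is treated as redundant.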
Coding:
C++ code has been removed from this version.
# install.packages("StepReg")
library(StepReg)
## stepwise linear regression
# remove the intercept and add a new variable `yes` that duplicates `wt` in the mtcars dataset
data(mtcars)
mtcars$yes <- mtcars$wt
formula <- cbind(mpg,drat) ~ . + 0
stepwise(formula = formula,
         data = mtcars,
         include = NULL,
         selection = "bidirection",
         select = "AIC",
         sle = 0.15,
         sls = 0.15,
         multivarStat = "Pillai",
         weights = NULL,
         best = NULL)
## stepwise logistic regression
formula <- vs ~ .
stepwiseLogit(formula,
              data = mtcars,
              include = NULL,
              selection = "bidirection",
              select = "SL",
              sle = 0.15,
              sls = 0.15,
              sigMethod = "Rao",
              weights = NULL,
              best = NULL)
## stepwise Cox regression
library(survival)  # makes Surv() available for the formula below
lung <- survival::lung
my.data <- na.omit(lung)
my.data$status1 <- ifelse(my.data$status==2,1,0)
data <- my.data
formula <- Surv(time, status1) ~ . - status
stepwiseCox(formula,
            data,
            include = NULL,
            selection = c("bidirection"),
            select = "SL",
            method = c("efron"),
            sle = 0.15,
            sls = 0.15,
            weights = NULL,
            best = NULL)
Results of multivariate stepwise regression are consistent with the reference
SAS software validation:
The final results from this package have been validated against SAS software using three data sets:
data set 1, without class effects: 13 dependent variables, 129 independent variables and 216 samples.
data set 2, with 4 class effects: 12 dependent variables, 1270 independent variables and 647 samples.
data set 3, with 6 class effects: 5 dependent variables, 2068 independent variables and 412 samples.