rxStepControl: Control for Stepwise Regression
Description
Various parameters that control aspects of stepwise regression.
Usage
rxStepControl(method = "stepwise", scope = NULL,
              maxSteps = 1000, stepCriterion = "AIC",
              maxSigLevelToAdd = NULL, minSigLevelToDrop = NULL,
              refitEachStep = NULL, keepStepCoefs = FALSE,
              scale = 0, k = 2, test = NULL, ...)
Arguments
method
a character string specifying the method of stepwise search:
- "stepwise": bi-directional search.
- "backward": backward elimination.
- "forward": forward selection.
Default is "stepwise" if the scope argument is not missing, otherwise "backward".
scope
either a single formula, or a named list containing components upper and lower, both formulae, defining the range of models to be examined in the stepwise search.
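The two accepted forms of scope can be sketched as follows. The base R step calls are included only because they run without RevoScaleR. In step, a single formula is taken as the upper bound; whether rxStepControl treats a single formula the same way is an assumption here, not something stated above. The object names are illustrative.

```r
# Form 1: a named list with explicit lower and upper bounds.
scope_list <- list(
  lower = ~ Sepal.Width,
  upper = ~ Sepal.Width + Petal.Length + Petal.Width
)

# Form 2: a single formula.
scope_single <- ~ Sepal.Width + Petal.Length + Petal.Width

# Base R illustration: with a lower bound, Sepal.Width can never be dropped.
base_fit <- lm(Sepal.Length ~ Sepal.Width, data = iris)
step_list <- step(base_fit, scope = scope_list, direction = "both", trace = 0)
step_single <- step(base_fit, scope = scope_single, direction = "both", trace = 0)

# The same scope objects passed to rxStepControl (only if RevoScaleR is installed).
if (requireNamespace("RevoScaleR", quietly = TRUE)) {
  ctrl_list <- RevoScaleR::rxStepControl(method = "stepwise", scope = scope_list)
  ctrl_single <- RevoScaleR::rxStepControl(method = "stepwise", scope = scope_single)
}
```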
maxSteps
an integer specifying the maximum number of steps to be considered, typically used to stop the process early. The default is 1000.
stepCriterion
a character string specifying the variable selection criterion:
- "AIC": Akaike's information criterion.
- "SigLevel": significance level, the traditional stepwise approach in SAS.
This argument is similar to the SELECT option of the GLMSELECT procedure in SAS. Default is "AIC".
maxSigLevelToAdd
a numeric scalar specifying the significance level for adding a variable to the model. This argument is used only when stepCriterion = "SigLevel" and is similar to the SLENTRY option of the GLMSELECT procedure in SAS. The defaults are 0.50 for "forward" and 0.15 for "stepwise".
minSigLevelToDrop
a numeric scalar specifying the significance level for dropping a variable from the model. This argument is used only when stepCriterion = "SigLevel" and is similar to the SLSTAY option of the GLMSELECT procedure in SAS. The defaults are 0.10 for "backward" and 0.15 for "stepwise".
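A minimal sketch of selection by significance level, using the arguments described above. The thresholds of 0.10 are arbitrary illustrative values, not recommendations, and the RevoScaleR calls run only if that package is installed.

```r
# Search range: Sepal.Width is always kept; the upper model adds the other terms.
scope <- list(
  lower = ~ Sepal.Width,
  upper = ~ Sepal.Width + Petal.Length + Petal.Width * Species
)

# Arguments for a SigLevel-based search; 0.10 is an illustrative choice.
sig_args <- list(
  method = "stepwise",
  scope = scope,
  stepCriterion = "SigLevel",
  maxSigLevelToAdd = 0.10,   # a term must test below this level to enter
  minSigLevelToDrop = 0.10   # a term testing above this level is dropped
)

if (requireNamespace("RevoScaleR", quietly = TRUE)) {
  varsel_sig <- do.call(RevoScaleR::rxStepControl, sig_args)
  fit_sig <- RevoScaleR::rxLinMod(Sepal.Length ~ Sepal.Width + Petal.Length,
                                  data = iris, variableSelection = varsel_sig)
}
```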
refitEachStep
a logical flag specifying whether or not to refit the model at each step. The default is NULL, indicating to refit the model at each step for rxLogit and rxGlm but not for rxLinMod.
keepStepCoefs
a logical flag specifying whether or not to keep the model coefficients at each step. If TRUE, a data.frame stepCoefs will be returned with the fitted model, with rows corresponding to the coefficients and columns corresponding to the iterations. Additional computation may be required to generate the coefficients at each step. Those stepwise coefficients can be visualized by plotting the fitted model with rxStepPlot.
scale
optional numeric scalar specifying the scale parameter of the model. It is used in computing the AIC statistics for selecting the models. The default 0 indicates it should be estimated by maximum likelihood. See "scale" in step for details.
k
optional numeric scalar specifying the multiplier applied to the number of equivalent degrees of freedom in the penalty term of the AIC. The default is 2. See "k" in step for details.
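Because the text defers to step for the meaning of k, the effect of the penalty weight can be illustrated with base R's extractAIC, which step uses internally. This is a sketch of the base R behavior; that rxStepControl applies k in the same way is assumed from the cross-reference. Setting k = log(n) gives a BIC-style penalty.

```r
fit <- lm(Sepal.Length ~ Sepal.Width + Petal.Length, data = iris)
n <- nrow(iris)

# extractAIC returns c(equivalent degrees of freedom, criterion value).
aic_like <- extractAIC(fit, k = 2)        # classical AIC penalty
bic_like <- extractAIC(fit, k = log(n))   # BIC-style penalty

# The two criteria differ only through the penalty term: edf * (log(n) - 2).
penalty_gap <- unname(bic_like[2] - aic_like[2])

# The same weight passed to rxStepControl (only if RevoScaleR is installed).
if (requireNamespace("RevoScaleR", quietly = TRUE)) {
  varsel_bic <- RevoScaleR::rxStepControl(method = "backward", k = log(n))
}
```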
test
a character string specifying the test statistic to be included in the results, either "F" or "Chisq". Both test statistics are relative to the original model.
...
additional arguments to be passed directly to the Microsoft R Services Compute Engine.
Details
Stepwise models must be computed on the same data set in order to be compared, so rows with missing values in any of the variables in the upper model are removed before the model fitting starts. Consequently, if there are missing values in the data set, the stepwise models might differ from the corresponding models fitted with only the selected variables.
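The consequence of this row removal can be reproduced in base R. The sketch below injects missing values into a copy of iris (an artificial setup for illustration) and compares a fit on the rows that are complete for every upper-model variable with a fit that only needs the selected variables.

```r
# Copy of iris with missing values in a variable that appears only in the upper model.
iris_na <- iris
iris_na$Petal.Width[1:10] <- NA

# Rows usable by the stepwise search: complete for every upper-model variable.
upper_vars <- c("Sepal.Length", "Sepal.Width", "Petal.Length",
                "Petal.Width", "Species")
keep <- complete.cases(iris_na[, upper_vars])

# Same formula, different rows: the stepwise-style fit drops the 10 incomplete rows.
fit_stepwise_rows <- lm(Sepal.Length ~ Sepal.Width, data = iris_na[keep, ])
fit_all_rows <- lm(Sepal.Length ~ Sepal.Width, data = iris_na)

c(stepwise = nobs(fit_stepwise_rows), direct = nobs(fit_all_rows))
```

Even though the final formula is identical, the two fits use different numbers of observations, so their coefficients differ.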
When computing stepwise models with rxLogit or rxGlm, you can sometimes improve the speed and quality of the fitting by setting returnAlways=TRUE in the initial rxLogit or rxGlm call. When returnAlways=TRUE, rxLogit and rxGlm always return the solution tried so far that has the minimum deviance.
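A hedged sketch of such a call follows. The mtcars data, the formula, and the scope are illustrative choices that do not come from this page, and the RevoScaleR calls run only if that package is installed.

```r
# Illustrative binary-response model on mtcars (am is 0/1).
logit_form <- am ~ wt
logit_scope <- list(lower = ~ wt, upper = ~ wt + hp + qsec)

if (requireNamespace("RevoScaleR", quietly = TRUE)) {
  varsel <- RevoScaleR::rxStepControl(method = "stepwise", scope = logit_scope)
  # returnAlways = TRUE is set in the initial rxLogit call, as described above.
  logit_step <- RevoScaleR::rxLogit(logit_form, data = mtcars,
                                    variableSelection = varsel,
                                    returnAlways = TRUE)
}
```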
Value
A list containing the options.
Author(s)
Microsoft Corporation Microsoft Technical Support
References
Venables, W. N. and Ripley, B. D. (2002) Modern Applied Statistics with S. New York: Springer (4th ed).
Chambers, J. M. and Hastie, T. J. eds (1992) Statistical Models in S. Wadsworth & Brooks/Cole.
Goodnight, J. H. (1979) A Tutorial on the SWEEP Operator. The American Statistician Vol. 33 No. 3, 149-158.
See Also
step, rxStepPlot.
Examples
## setup
form <- Sepal.Length ~ Sepal.Width + Petal.Length
scope <- list(
    lower = ~ Sepal.Width,
    upper = ~ Sepal.Width + Petal.Length + Petal.Width * Species)
## lm/step
## We need to specify the contrasts for the factor variable Species,
## even though this is not part of the original model. This will
## generate a warning, so we suppress that warning here.
suppressWarnings(rlm.obj <- lm(form, data = iris, contrasts = list(Species = contr.SAS)))
rlm.step <- step(rlm.obj, direction = "both", scope = scope, trace = 1)
## rxLinMod/variableSelection
varsel <- rxStepControl(method = "stepwise", scope = scope)
rxlm.step <- rxLinMod(form, data = iris, variableSelection = varsel,
                      verbose = 1, dropMain = FALSE, coefLabelStyle = "R")
## compare lm/step and rxLinMod/variableSelection
rlm.step$anova
rxlm.step$anova
as.matrix(coef(rlm.step))
as.matrix(coef(rxlm.step))
## rxLinMod/variableSelection with keepStepCoefs = TRUE
varsel <- rxStepControl(method = "stepwise", scope = scope, keepStepCoefs = TRUE)
rxlm.step <- rxLinMod(form, data = iris, variableSelection = varsel,
                      verbose = 1, dropMain = FALSE, coefLabelStyle = "R")
rxStepPlot(rxlm.step)