What is the difference between hierarchical and stepwise regressions?
In hierarchical regression, predictors are entered cumulatively in a pre-specified order that is dictated in advance by the purpose and logic of the research. The hierarchical model calls for R-squared and the partial regression coefficients of each variable or set of variables to be determined at the stage at which that block of variables is added to the multiple regression.
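The quantity usually reported at each stage is the change in R-squared, together with an F test of that change. In the notation below (introduced here for illustration, not taken from the cited texts), R^2_k is the R-squared after stage k with R^2_0 = 0, m_k is the number of predictors added at stage k, p_k is the total number of predictors in the model after stage k, and n is the sample size. The F statistic has m_k and n - p_k - 1 degrees of freedom.

```latex
\Delta R^2_k = R^2_k - R^2_{k-1},
\qquad
F_{\text{change}} = \frac{\Delta R^2_k / m_k}{\left(1 - R^2_k\right) / \left(n - p_k - 1\right)}
```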
An example given by Tabachnick and Fidell (2007) examines whether people tend to visit health professionals for reasons associated with their mental health that are not stress related. Two predictors, one measuring stress and one measuring mental health, are entered in a specific order. A single stress variable is entered in stage 1 (to indicate its association with visits) and a mental health variable in stage 2 (to indicate how much more it predicts visits over and above stress). Other examples are relating depression scores and IQ tests (entered in stage 2) to an outcome after controlling for demographics (entered in stage 1), and interaction analyses, where higher order interactions may only be added to the model once the lower order terms are present. For example, in predicting intelligence the main effects of age and gender are added first (stage 1), followed by the age by gender interaction (stage 2).
Hierarchical multiple linear regressions may be carried out using blocks in SPSS by entering stage 1 variables in block 1 and, in general, stage k variables in block k. For example, suppose we wish to see whether previous histories of stress and of psychiatric disorders predict memory score independently of age and gender. We could then enter age and gender in block 1 and the two previous histories in block 2 using the syntax below.
REGRESSION
  /MISSING LISTWISE
  /STATISTICS COEFF OUTS R ANOVA
  /CRITERIA=PIN(.05) POUT(.10)
  /NOORIGIN
  /DEPENDENT memtot
  /METHOD=ENTER age gender
  /METHOD=ENTER prevh_str prev_psych.
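For readers without SPSS, the blockwise logic can be sketched in plain Python. This is only an illustration under stated assumptions: the data are synthetic (the effect sizes are invented), the variable names are borrowed from the syntax above, and `r_squared` is a hypothetical helper written for this sketch rather than a library function. It fits each stage by ordinary least squares and reports R-squared, the R-squared change and the F statistic for that change.

```python
import random

def r_squared(columns, y):
    """R-squared of an OLS fit of y on the given predictor columns plus an intercept."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(X[0])
    # Normal equations (X'X) b = X'y stored as an augmented matrix.
    A = [[sum(X[i][r] * X[i][c] for i in range(n)) for c in range(p)]
         + [sum(X[i][r] * y[i] for i in range(n))] for r in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for k in range(p):
        piv = max(range(k, p), key=lambda r: abs(A[r][k]))
        A[k], A[piv] = A[piv], A[k]
        for r in range(p):
            if r != k:
                f = A[r][k] / A[k][k]
                A[r] = [a - f * b for a, b in zip(A[r], A[k])]
    b = [A[k][p] / A[k][k] for k in range(p)]
    mean_y = sum(y) / n
    ss_res = sum((y[i] - sum(bj * xj for bj, xj in zip(b, X[i]))) ** 2 for i in range(n))
    ss_tot = sum((v - mean_y) ** 2 for v in y)
    return 1.0 - ss_res / ss_tot

# Synthetic data (assumption): invented effects, for illustration only.
random.seed(1)
n = 200
age = [random.uniform(20, 80) for _ in range(n)]
gender = [float(random.randint(0, 1)) for _ in range(n)]
prevh_str = [float(random.randint(0, 1)) for _ in range(n)]
prev_psych = [float(random.randint(0, 1)) for _ in range(n)]
memtot = [60 - 0.3 * a + 2 * g - 4 * s - 3 * p + random.gauss(0, 5)
          for a, g, s, p in zip(age, gender, prevh_str, prev_psych)]

# Blocks are entered cumulatively in the order fixed in advance.
blocks = [("block 1", [age, gender]), ("block 2", [prevh_str, prev_psych])]
entered, r2_prev, results = [], 0.0, []
for name, block in blocks:
    entered = entered + block
    r2 = r_squared(entered, memtot)
    m, p_total = len(block), len(entered)
    # F test of the R-squared change, with m and n - p_total - 1 df.
    f_change = ((r2 - r2_prev) / m) / ((1 - r2) / (n - p_total - 1))
    results.append((name, r2, r2 - r2_prev, f_change))
    print(f"{name}: R2={r2:.3f}  R2 change={r2 - r2_prev:.3f}  F change={f_change:.2f}")
    r2_prev = r2
```

The R-squared change for block 2 is the part of the variance in memory score explained by the two previous histories over and above age and gender.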
Forward stepwise regression programs are designed to select from a group of predictors, at each stage, the one variable which has the largest semi-partial r-squared and hence makes the largest contribution to R-squared. (This will also be the variable that has the largest absolute t value.) Such programs typically stop admitting predictors into the equation when no remaining predictor makes a contribution which is statistically significant at a level specified by the user. Thus, the stepwise procedure defines an a posteriori order based solely on a statistical consideration (the statistical significance of the semi-partial correlations).
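The forward procedure can be sketched as follows. This is a minimal illustration, not the algorithm of any particular package: the data are synthetic, `r_squared` and `forward_stepwise` are hypothetical helpers written for this sketch, and an F-to-enter threshold stands in for the user-specified significance level (the default of 4.0 is an arbitrary choice, close to the .05 critical value of F with 1 numerator degree of freedom in large samples). For a single added variable the F statistic equals the squared t value.

```python
import random

def r_squared(columns, y):
    """R-squared of an OLS fit of y on the given predictor columns plus an intercept."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(X[0])
    # Normal equations (X'X) b = X'y stored as an augmented matrix.
    A = [[sum(X[i][r] * X[i][c] for i in range(n)) for c in range(p)]
         + [sum(X[i][r] * y[i] for i in range(n))] for r in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for k in range(p):
        piv = max(range(k, p), key=lambda r: abs(A[r][k]))
        A[k], A[piv] = A[piv], A[k]
        for r in range(p):
            if r != k:
                f = A[r][k] / A[k][k]
                A[r] = [a - f * b for a, b in zip(A[r], A[k])]
    b = [A[k][p] / A[k][k] for k in range(p)]
    mean_y = sum(y) / n
    ss_res = sum((y[i] - sum(bj * xj for bj, xj in zip(b, X[i]))) ** 2 for i in range(n))
    ss_tot = sum((v - mean_y) ** 2 for v in y)
    return 1.0 - ss_res / ss_tot

def forward_stepwise(candidates, y, f_to_enter=4.0):
    """Add, one at a time, the candidate with the largest R-squared gain."""
    n = len(y)
    selected, remaining, r2_current = [], list(candidates), 0.0
    while remaining:
        base = [candidates[s] for s in selected]
        # Gain in R-squared = squared semi-partial correlation of each candidate.
        gains = {name: r_squared(base + [candidates[name]], y) - r2_current
                 for name in remaining}
        best = max(gains, key=gains.get)
        r2_new = r2_current + gains[best]
        df_resid = n - (len(selected) + 1) - 1
        f_enter = gains[best] / ((1 - r2_new) / df_resid)
        if f_enter < f_to_enter:
            break  # no remaining predictor meets the entry criterion
        selected.append(best)
        remaining.remove(best)
        r2_current = r2_new
    return selected, r2_current

# Synthetic data (assumption): x1 and x2 are related to y, the others are not.
random.seed(42)
n = 200
candidates = {name: [random.gauss(0, 1) for _ in range(n)]
              for name in ("x1", "x2", "noise1", "noise2")}
y = [2 * a + 1 * b + random.gauss(0, 1)
     for a, b in zip(candidates["x1"], candidates["x2"])]

selected, r2 = forward_stepwise(candidates, y)
print("entry order:", selected, "final R2 = %.3f" % r2)
```

The entry order is decided entirely by the data, which is the sense in which the ordering is a posteriori.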
Backwards stepwise regression procedures work in the opposite order. The dependent variable is first regressed on all its predictors. If any variables are not statistically significant, the one making the smallest contribution is dropped (i.e. the variable with the smallest semi-partial r-squared, which will also be the variable with the smallest absolute t value). The dependent variable is then regressed on the remaining variables, and again the one making the smallest contribution is dropped if it is not statistically significant. The procedure continues until all remaining variables are statistically significant. Note that forwards and backwards regression need not produce the same final model.
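Backward elimination can be sketched in the same spirit. As before this is an illustration under assumptions: synthetic data, hypothetical helpers `r_squared` and `backward_elimination`, and an F-to-remove threshold (4.0, chosen arbitrarily) standing in for the significance test of each predictor.

```python
import random

def r_squared(columns, y):
    """R-squared of an OLS fit of y on the given predictor columns plus an intercept."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(X[0])
    # Normal equations (X'X) b = X'y stored as an augmented matrix.
    A = [[sum(X[i][r] * X[i][c] for i in range(n)) for c in range(p)]
         + [sum(X[i][r] * y[i] for i in range(n))] for r in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for k in range(p):
        piv = max(range(k, p), key=lambda r: abs(A[r][k]))
        A[k], A[piv] = A[piv], A[k]
        for r in range(p):
            if r != k:
                f = A[r][k] / A[k][k]
                A[r] = [a - f * b for a, b in zip(A[r], A[k])]
    b = [A[k][p] / A[k][k] for k in range(p)]
    mean_y = sum(y) / n
    ss_res = sum((y[i] - sum(bj * xj for bj, xj in zip(b, X[i]))) ** 2 for i in range(n))
    ss_tot = sum((v - mean_y) ** 2 for v in y)
    return 1.0 - ss_res / ss_tot

def backward_elimination(candidates, y, f_to_remove=4.0):
    """Start from the full model and drop the weakest predictor while it fails the test."""
    n = len(y)
    kept = list(candidates)
    while kept:
        r2_full = r_squared([candidates[k] for k in kept], y)
        df_resid = n - len(kept) - 1
        # Loss in R-squared when a predictor is dropped = its squared semi-partial correlation.
        losses = {name: r2_full - r_squared([candidates[k] for k in kept if k != name], y)
                  for name in kept}
        weakest = min(losses, key=losses.get)
        f_remove = losses[weakest] / ((1 - r2_full) / df_resid)
        if f_remove >= f_to_remove:
            break  # every remaining predictor meets the retention criterion
        kept.remove(weakest)
    return kept

# Synthetic data (assumption): x1 and x2 are related to y, the others are not.
random.seed(42)
n = 200
candidates = {name: [random.gauss(0, 1) for _ in range(n)]
              for name in ("x1", "x2", "noise1", "noise2")}
y = [2 * a + 1 * b + random.gauss(0, 1)
     for a, b in zip(candidates["x1"], candidates["x2"])]

kept = backward_elimination(candidates, y)
print("retained predictors:", kept)
```

Because every predictor in a given step is tested against the same full model, the predictor with the smallest loss in R-squared is also the one with the smallest F (squared t) value.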
Cohen J and Cohen P (1983) Applied multiple regression/correlation analysis for the behavioral sciences. Second Edition. Lawrence Erlbaum: Hillsdale, NJ.
Tabachnick BG and Fidell LS (2007) Using multivariate statistics. Fifth Edition. Pearson International Edition: Boston.
Note: Stepwise procedures have been criticised. The paper below proposes decision trees as an alternative and illustrates their use in R:
Miller PJ, Lubke GH, McArtor DB and Bergeman CS (2017) Finding structure in data using multivariate tree boosting. Psychological Methods 21(4) 583-602.