What is the relationship between predictor variable correlations and the presence of a suppressor variable?
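A suppressor variable is a variable that reduces the value of a regression estimate when it is added to the model. For example, in a regression with two predictors, X1 and X2, and a response Y, X2 suppresses X1 if B(X1, Y | X2) < B(X1, Y), where B(X1, Y) is the regression coefficient for X1 when it is the only predictor of Y and B(X1, Y | X2) is its coefficient when X2 is also in the model.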
Howell (1992, p.507) quotes Darlington (1990) as saying that a variable (such as X2) will serve as a suppressor variable when it correlates more highly with Yr than with Y, where Yr is the residual from regressing Y on X1, i.e. the part of Y which is unexplained by X1.
Suppressor variables may therefore occur even when all the variables (predictors and response) are positively correlated. Below is one such example, in which X2 is a suppressor variable of X1 in predicting Y. Y, X1 and X2 all have positive zero-order Pearson correlations. The regression coefficient for X1 as the sole predictor of Y is 1.15, and this goes down to 1.01 when X2 is added to the regression, so X2 acts as a suppressor variable on X1.
Y | X1 | X2
1 | 4 | 2
2 | 5 | 4
3 | 6 | 5
4 | 7 | 2
5 | 6 | 8
All the zero-order correlations are positive (r(Y,X1)=0.83, r(Y,X2)=0.64 and r(X1,X2)=0.21).
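As a quick check, the zero-order correlations can be recomputed from the five data rows. The sketch below assumes NumPy is available; the variable names (`y`, `x1`, `x2`, `corr`) are illustrative and not part of the original example.

```python
import numpy as np

# Data from the example table (one value per case).
y = np.array([1, 2, 3, 4, 5], dtype=float)
x1 = np.array([4, 5, 6, 7, 6], dtype=float)
x2 = np.array([2, 4, 5, 2, 8], dtype=float)

# np.corrcoef treats each row as a variable and returns the
# matrix of Pearson correlations between them.
corr = np.corrcoef([y, x1, x2])
r_y_x1 = corr[0, 1]
r_y_x2 = corr[0, 2]
r_x1_x2 = corr[1, 2]

print(f"r(Y,X1)  = {r_y_x1:.2f}")
print(f"r(Y,X2)  = {r_y_x2:.2f}")
print(f"r(X1,X2) = {r_x1_x2:.2f}")
```

All three values should be positive and agree, to two decimal places, with those reported above.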
With X1 as the only predictor of Y, its regression coefficient is 1.15.
Term | B | Std. Error
Constant | -3.46 | 2.53
X1 | 1.15 | 0.44
Adding X2 to the model reduces (suppresses) the regression coefficient for X1 to 1.01.
Term | B | Std. Error
Constant | -3.96 | 1.66
X1 | 1.01 | 0.30
X2 | 0.30 | 0.14
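The two sets of coefficients can be reproduced with ordinary least squares. This is a minimal sketch assuming NumPy; the helper `ols` is a hypothetical name introduced here for illustration, and it returns only the coefficients, not the standard errors.

```python
import numpy as np

# Data from the example table (one value per case).
y = np.array([1, 2, 3, 4, 5], dtype=float)
x1 = np.array([4, 5, 6, 7, 6], dtype=float)
x2 = np.array([2, 4, 5, 2, 8], dtype=float)


def ols(response, *predictors):
    """Return least-squares coefficients [constant, b1, b2, ...]."""
    # Design matrix: a column of ones for the constant, then the predictors.
    design = np.column_stack([np.ones_like(response), *predictors])
    coef, *_ = np.linalg.lstsq(design, response, rcond=None)
    return coef


b_simple = ols(y, x1)     # Y regressed on X1 only
b_full = ols(y, x1, x2)   # Y regressed on X1 and X2

print(f"B(X1, Y)      = {b_simple[1]:.2f}")
print(f"B(X1, Y | X2) = {b_full[1]:.2f}")
print(f"B(X2, Y | X1) = {b_full[2]:.2f}")
```

The coefficient for X1 should be smaller in the two-predictor model than in the one-predictor model, which is the reduction described in this example.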
The zero-order correlation between X2 and the residual from the regression of Y on X1 is 0.83, which is greater than 0.64, the zero-order correlation between X2 and Y. This means X2 is a more influential predictor of the part of Y which is not related to X1 than of Y itself.
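The criterion quoted from Darlington can be checked directly by forming the residual Yr and correlating it with X2. The sketch below assumes NumPy; `y_resid`, `r_x2_resid` and `r_x2_y` are illustrative names, not part of the original example.

```python
import numpy as np

# Data from the example table (one value per case).
y = np.array([1, 2, 3, 4, 5], dtype=float)
x1 = np.array([4, 5, 6, 7, 6], dtype=float)
x2 = np.array([2, 4, 5, 2, 8], dtype=float)

# Regress Y on X1 alone; np.polyfit returns the highest power first,
# so a degree-1 fit gives (slope, intercept).
slope, intercept = np.polyfit(x1, y, deg=1)

# Yr: the part of Y left unexplained by X1.
y_resid = y - (intercept + slope * x1)

r_x2_resid = np.corrcoef(x2, y_resid)[0, 1]
r_x2_y = np.corrcoef(x2, y)[0, 1]

print(f"r(X2, Yr) = {r_x2_resid:.2f}")
print(f"r(X2, Y)  = {r_x2_y:.2f}")
print("Criterion met:", r_x2_resid > r_x2_y)
```

If X2 correlates more highly with Yr than with Y, the criterion is met for these data.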
References
Darlington RB (1990). Regression and linear models. McGraw-Hill: New York.
Howell DC (1992). Statistical methods for psychology. Third edition. Wadsworth: Belmont, CA.