# How do I adjust a correlation for group differences?

Suppose we have a correlation between X and Y and a grouping variable, G, which has two levels. The correlation between X and Y adjusted for group is akin to a hierarchical regression predicting Y, with group entered in the first step and X in the second, so that the change in R-squared on adding X represents the proportion of variance in Y explained by X over and above group. The full model may also be regarded as an ANCOVA.

A simpler way of doing this is to correlate X with the residuals formed by regressing Y on group. This is more familiarly expressed as correlating X with Y', where Y' is Y minus its group mean and each person is assumed to be a member of exactly one of the two groups. The significance of the adjusted correlation is the same as the significance of the regression coefficient for the covariate in the ANCOVA, provided the test allows for the degree of freedom used in adjusting for group.

There are two correlations that we can compute, the semi-partial (or part) correlation and the partial correlation. The semi-partial correlation is the correlation between Y and X' where X' is X minus the respective group mean for X. The partial correlation is the correlation between Y' and X' where Y' equals Y minus the respective Y group mean and X' equals X minus the respective X group mean. The choice of which to use depends upon the relative importance of Y and X and, in particular, whether one is a natural predictor of the other. The change in R-squared using the ANCOVA is the square of the semi-partial correlation.
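The two definitions can be written compactly as follows. This is only a sketch: the symbols $\bar{X}_g$ and $\bar{Y}_g$ (group means), $r_{sp}$, $r_{p}$ and $R^2_{Y \cdot G}$ (the R-squared from regressing Y on group alone) are notation introduced here for illustration and are not used elsewhere on this page.

$$X' = X - \bar{X}_{g}, \qquad Y' = Y - \bar{Y}_{g}$$

$$r_{sp} = \mathrm{corr}(Y, X'), \qquad r_{p} = \mathrm{corr}(Y', X')$$

$$r_{sp} = r_{p}\sqrt{1 - R^2_{Y \cdot G}}$$

The last line shows that the semi-partial correlation can never be larger in magnitude than the partial correlation, since the partial correlation is scaled by the share of variance in Y that group leaves unexplained.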

An example (SPSS data spreadsheet is here).

| Y | X | Group |
|---|---|-------|
| 2 | 2 | 1 |
| 1 | 3 | 1 |
| 2 | 4 | 1 |
| 3 | 2 | 1 |
| 4 | 3 | 2 |
| 2 | 4 | 2 |
| 3 | 2 | 2 |
| 4 | 3 | 2 |

The group 1 and group 2 means are 2.00 and 3.25 respectively for Y, and 2.75 and 3.00 respectively for X. Subtracting the respective group means from Y and X gives yd and xd. For example, for the first observation yd = 2 - 2.00 = 0 and xd = 2 - 2.75 = -0.75.

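| yd | xd | Group |
|----|----|-------|
| 0 | -0.75 | 1 |
| -1.00 | 0.25 | 1 |
| 0 | 1.25 | 1 |
| 1.00 | -0.75 | 1 |
| 0.75 | 0 | 2 |
| -1.25 | 1.00 | 2 |
| -0.25 | -1.00 | 2 |
| 0.75 | 0 | 2 |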
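The group-mean centering can be reproduced with a few lines of code. The following is a minimal sketch in Python using only the standard library; the function name `group_center` and the variable names are hypothetical choices for illustration, not part of the SPSS analysis.

```python
from collections import defaultdict

# Example data from this page: (Y, X, group)
rows = [(2, 2, 1), (1, 3, 1), (2, 4, 1), (3, 2, 1),
        (4, 3, 2), (2, 4, 2), (3, 2, 2), (4, 3, 2)]

def group_center(values, groups):
    """Subtract from each value the mean of the group it belongs to.

    Returns the centered values and a dict of group means.
    """
    totals, counts = defaultdict(float), defaultdict(int)
    for v, k in zip(values, groups):
        totals[k] += v
        counts[k] += 1
    means = {k: totals[k] / counts[k] for k in totals}
    return [v - means[k] for v, k in zip(values, groups)], means

y = [r[0] for r in rows]
x = [r[1] for r in rows]
g = [r[2] for r in rows]

yd, y_means = group_center(y, g)
xd, x_means = group_center(x, g)

print("Y group means:", y_means)
print("X group means:", x_means)
for yd_i, xd_i, g_i in zip(yd, xd, g):
    print(f"{yd_i:6.2f} {xd_i:6.2f} {g_i}")
```

Because the centered values sum to zero within each group, any remaining association between yd and xd cannot be due to differences between the group means.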

The Pearson correlation between xd and Y is -0.327. This is the semi-partial correlation of X and Y adjusted for group differences.

Hierarchical (ANCOVA) Model Summary

| Predictors | R | R Square |
|------------|---|----------|
| (Constant), group | 0.630 | 0.397 |
| (Constant), group, X | 0.710 | 0.504 |

The semi-partial correlation above is also equal to the square root of the change in R-squared, which from the model summary table above gives $$-\sqrt{0.504 - 0.397} = -0.327$$ (taking the negative square root because the relationship between X and Y is negative).
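The agreement between the semi-partial correlation and the change in R-squared can be checked numerically. The sketch below is in Python with only the standard library, and rests on one stated assumption: rather than fitting the two-predictor regression directly, it obtains the residual sum of squares of the full model from the group-centered variables (the Frisch-Waugh-Lovell result). The helper names `center` and `dot` are hypothetical.

```python
import math

# Example data from this page
y = [2, 1, 2, 3, 4, 2, 3, 4]
x = [2, 3, 4, 2, 3, 4, 2, 3]
g = [1, 1, 1, 1, 2, 2, 2, 2]

def center(values, groups):
    """Return each value minus the mean of the group it belongs to."""
    out = []
    for v, k in zip(values, groups):
        members = [vi for vi, ki in zip(values, groups) if ki == k]
        out.append(v - sum(members) / len(members))
    return out

def dot(a, b):
    """Sum of element-wise products."""
    return sum(ai * bi for ai, bi in zip(a, b))

y0 = center(y, [0] * len(y))   # Y about its grand mean
yd = center(y, g)              # Y about its group means
xd = center(x, g)              # X about its group means

ss_total = dot(y0, y0)         # total sum of squares of Y
ss_within = dot(yd, yd)        # residual SS after fitting group only

# R-squared for the group-only model
r2_group = 1 - ss_within / ss_total

# Assumption: residual SS of the full model (group + X) obtained by
# regressing yd on xd, which gives the same residuals as the full fit.
ss_resid_full = ss_within - dot(xd, yd) ** 2 / dot(xd, xd)
r2_full = 1 - ss_resid_full / ss_total

# Semi-partial correlation: Pearson correlation of Y with xd
semi_partial = dot(xd, y0) / math.sqrt(dot(xd, xd) * ss_total)

print(round(r2_group, 3), round(r2_full, 3))
print(round(r2_full - r2_group, 4), round(semi_partial ** 2, 4))
print(round(semi_partial, 3))
```

The change in R-squared and the squared semi-partial correlation printed on the second line should agree apart from floating-point error; the sign of the correlation itself has to come from the direction of the relationship, since squaring discards it.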

The correlation between yd and xd is the partial correlation of X and Y adjusted for group, and equals -0.421. It may also be obtained from the ANCOVA model by running the REGRESSION procedure in SPSS with the ZPP keyword on the STATISTICS subcommand, using the syntax below:

REGRESSION
/MISSING LISTWISE
/STATISTICS COEFF OUTS R ANOVA ZPP
/CRITERIA=PIN(.05) POUT(.10)
/NOORIGIN
/DEPENDENT y
/METHOD=ENTER group
/METHOD=ENTER x .

The bottom line of the Coefficients table in the SPSS output using this syntax gives the correlations we have calculated above.

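| Predictor | B | Std. Error | Beta | t | Sig. | Zero-order r | Partial r | Part r |
|-----------|---|------------|------|---|------|--------------|-----------|--------|
| x | -.421 | .406 | -.331 | -1.038 | .347 | -.222 | -.421 | -.327 |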
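Several entries in this row can be recovered from the group-centered variables alone. The sketch below, in Python with only the standard library, computes the coefficient B, the partial correlation, and the t statistic. It assumes the usual conversion t = r * sqrt(df / (1 - r^2)) with df = n - 3 (one degree of freedom each for the constant, group and x); the helper name `center` is hypothetical. The p-value is not computed because it needs the t distribution, which the standard library does not provide.

```python
import math

# Example data from this page
y = [2, 1, 2, 3, 4, 2, 3, 4]
x = [2, 3, 4, 2, 3, 4, 2, 3]
g = [1, 1, 1, 1, 2, 2, 2, 2]

def center(values, groups):
    """Return each value minus the mean of the group it belongs to."""
    out = []
    for v, k in zip(values, groups):
        members = [vi for vi, ki in zip(values, groups) if ki == k]
        out.append(v - sum(members) / len(members))
    return out

yd = center(y, g)
xd = center(x, g)

sxy = sum(a * b for a, b in zip(xd, yd))   # within-group cross-products
sxx = sum(a * a for a in xd)               # within-group SS of X
syy = sum(b * b for b in yd)               # within-group SS of Y

# Slope for x in the model containing group (the ANCOVA covariate slope)
b = sxy / sxx

# Partial correlation: Pearson correlation between yd and xd
partial_r = sxy / math.sqrt(sxx * syy)

# Assumed degrees of freedom: n minus the three estimated coefficients
df = len(y) - 3
t = partial_r * math.sqrt(df / (1 - partial_r ** 2))
se = b / t                                  # standard error implied by B and t

print(round(b, 3), round(se, 3), round(t, 3), round(partial_r, 3))
```

In this example B and the partial correlation coincide only because the within-group sums of squares of X and Y happen to be equal; in general they differ.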