The effect of centering covariates

A covariate is centered by subtracting its overall mean from each covariate value. What follows also holds for a standardised covariate, because centering is performed as part of standardisation.
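As a minimal illustration of the two operations, the sketch below centers and then standardises a covariate with NumPy. The covariate values are hypothetical and the variable names are chosen only for this example.

```python
import numpy as np

# Hypothetical covariate values for six subjects (illustrative only).
z = np.array([23.0, 31.0, 27.0, 40.0, 35.0, 30.0])

# Centering: subtract the overall mean from each value.
z_centered = z - z.mean()

# Standardising: center, then divide by the sample standard deviation.
z_standardised = z_centered / z.std(ddof=1)

print(z_centered.sum())        # zero, up to floating-point rounding
print(z_standardised.mean())   # also zero, up to floating-point rounding
```

Both versions sum to zero, which is the property the results on this page rely on.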

Within subjects effect with 2 levels, W1, with a centered covariate (which can be either varying or constant):

N observations $$y_{i} = y_{i1} - y_{i2}$$, i = 1, …, N,

with covariate values $$z_{i1}$$ and $$z_{i2}$$.

Sums of squares (SS) (This uses a few results from matrix algebra given here).

SS(W1) = $$N \bar{y}^{2}$$

SS(W1 x covariate) = $$\frac{[\sum_{i} z_{i} y_{i}]^{2}}{\sum_{i} z_{i}^{2}}$$

SS(subjects x W1) = $$\sum_{i} y_{i}^{2} - N \bar{y}^{2} - \frac{[\sum_{i} z_{i} y_{i}]^{2}}{\sum_{i} z_{i}^{2}}$$

= Total SS – SS(W1) – SS(W1 x covariate)

Notice that when the covariate is centered SS(W1) is the same regardless of the presence of the covariate. It follows, therefore, that SS(W1) is the same regardless of which type of SS is used. The SS(subjects x W1) will decrease with the presence of the W1 x covariate interaction.
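The three sums of squares can be checked numerically. The sketch below is illustrative only: the data are simulated, the names `y1`, `y2` and `z_raw` are hypothetical, and a constant covariate is assumed so that each subject has a single value $$z_{i}$$.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 12

# Hypothetical data: two within-subject levels and one constant covariate.
y1 = rng.normal(10.0, 2.0, N)
y2 = rng.normal(9.0, 2.0, N)
z_raw = rng.normal(50.0, 10.0, N)

y = y1 - y2                  # difference scores y_i
z = z_raw - z_raw.mean()     # centered covariate z_i

total_ss = np.sum(y ** 2)
ss_w1 = N * y.mean() ** 2
ss_w1_cov = np.sum(z * y) ** 2 / np.sum(z ** 2)
ss_subj_w1 = total_ss - ss_w1 - ss_w1_cov

# Without the covariate the error term is Total SS - SS(W1).
ss_subj_w1_no_cov = total_ss - ss_w1

print("SS(W1)               ", ss_w1)
print("SS(W1 x covariate)   ", ss_w1_cov)
print("SS(subjects x W1)    ", ss_subj_w1)
print("error SS, no covariate", ss_subj_w1_no_cov)
```

SS(W1) is computed without reference to the covariate, and the error term with the covariate in the model is smaller than the error term without it by exactly SS(W1 x covariate).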

The covariance between W1 and the covariate is given by

$$-\frac{\sum_{i} z_{i}}{N \sum_{i} z_{i}^{2} - (\sum_{i} z_{i})^{2}}$$ x Mean Square error,

which equals 0 when the covariate is centered, since $$\sum_{i} z_{i}$$ = 0.
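The factor multiplying the Mean Square error is the off-diagonal element of the inverse cross-product matrix for a design with an intercept column and the covariate. A small sketch, using hypothetical covariate values and a helper name invented for this example, compares the formula with a direct matrix inverse:

```python
import numpy as np

def covariance_factor(z):
    """Return -sum(z) / (N * sum(z^2) - sum(z)^2) for covariate values z."""
    z = np.asarray(z, dtype=float)
    n = len(z)
    return -z.sum() / (n * np.sum(z ** 2) - z.sum() ** 2)

# Hypothetical covariate values (illustrative only).
z_raw = np.array([4.0, 7.0, 1.0, 9.0, 5.0, 4.0])
z_centered = z_raw - z_raw.mean()

# Design matrix with an intercept column and the uncentered covariate.
X = np.column_stack([np.ones(len(z_raw)), z_raw])
off_diagonal = np.linalg.inv(X.T @ X)[0, 1]

print(covariance_factor(z_raw), off_diagonal)   # the two values agree
print(covariance_factor(z_centered))            # zero once the covariate is centered
```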

The above SSs are equivalent to a linear regression with one covariate predictor (which gives the covariate x W1 SS) and an intercept (which gives the W1 SS) with the difference between the two levels of the within subject variable as the response.

Therefore the SS(W1) and SS(W1 x covariate) are orthogonal and are unchanged by the presence of the other in the repeated measures anova. The only SS which changes in the anova is the SS(subjects x W1) when a centered covariate is added. This term is reduced by SS(W1 x covariate).
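The orthogonality can be illustrated by computing the intercept sum of squares adjusted for the covariate, $$b_{0}^{2} / [(X'X)^{-1}]_{00}$$, and comparing it with the unadjusted value $$N \bar{y}^{2}$$. This is a sketch on simulated data; the function name and the data are assumptions made for the example, not part of the original analysis.

```python
import numpy as np

def adjusted_intercept_ss(y, z):
    """SS for the intercept adjusted for the covariate z in the model y ~ 1 + z."""
    X = np.column_stack([np.ones(len(y)), z])
    xtx_inv = np.linalg.inv(X.T @ X)
    b = xtx_inv @ X.T @ y            # least-squares estimates (intercept, slope)
    return b[0] ** 2 / xtx_inv[0, 0]

rng = np.random.default_rng(2)
N = 12
y = rng.normal(1.0, 1.0, N)          # hypothetical difference scores
z_raw = rng.normal(50.0, 10.0, N)    # hypothetical uncentered covariate

unadjusted = N * y.mean() ** 2
adjusted_centered = adjusted_intercept_ss(y, z_raw - z_raw.mean())
adjusted_raw = adjusted_intercept_ss(y, z_raw)

print(unadjusted, adjusted_centered, adjusted_raw)
```

With the centered covariate the adjusted and unadjusted values coincide, so the choice of SS type does not matter for W1; with the uncentered covariate they generally differ.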

A Suggested Strategy

If the W1 by covariate interaction is statistically significant, form the F ratio comparing SS(W1) to SS(subjects x W1) with the SS(W1 x covariate) term in the model. If the W1 by covariate interaction is not statistically significant, form the F ratio by comparing SS(W1) to the pooled SS(subjects x W1) + SS(W1 x covariate) term.

Since SS(subjects x W1) = Total SS – SS(W1) – SS(W1 x covariate), where the Total SS is a constant and SS(W1) and SS(W1 x covariate) are orthogonal, SS(subjects x W1) should be the same for SS types I to III, both in the model containing W1 alone and in the model containing W1 and W1 x covariate, when the covariate is centered.
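The F ratios involved in this strategy can be sketched as follows for the 2-level case. The degrees of freedom are an assumption based on the usual regression results (N – 2 for the error term with the covariate in the model, N – 1 for the pooled error term); the function name and data are hypothetical, and the significance decision itself still has to be made against the F distribution.

```python
import numpy as np

def w1_f_ratios(y, z):
    """F ratios for a 2-level W1 given difference scores y and a covariate z."""
    y = np.asarray(y, dtype=float)
    z = np.asarray(z, dtype=float)
    z = z - z.mean()                               # center the covariate
    n = len(y)
    ss_w1 = n * y.mean() ** 2
    ss_cov = np.sum(z * y) ** 2 / np.sum(z ** 2)   # SS(W1 x covariate)
    ss_err = np.sum(y ** 2) - ss_w1 - ss_cov       # SS(subjects x W1)
    return {
        # Test of the W1 x covariate interaction (1 and n - 2 df).
        "F_interaction": ss_cov / (ss_err / (n - 2)),
        # W1 tested with the interaction term kept in the model.
        "F_W1_covariate_in_model": ss_w1 / (ss_err / (n - 2)),
        # W1 tested against the pooled error term (1 and n - 1 df).
        "F_W1_pooled_error": ss_w1 / ((ss_err + ss_cov) / (n - 1)),
    }

# Hypothetical difference scores and covariate values.
y = np.array([1.2, 0.4, -0.3, 0.9, 1.5, 0.1, 0.7, -0.2])
z = np.array([3.0, 5.0, 2.0, 6.0, 7.0, 4.0, 5.0, 1.0])
ratios = w1_f_ratios(y, z)
for name, value in ratios.items():
    print(name, value)
```

The pooled version is the square of the paired t statistic for the difference scores, which is what the analysis reduces to when the covariate is dropped.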

W1 with 3 levels and a centered covariate (either constant or varying within subject)

SS(W1) = SS1 + SS2 where

SS1 is the SS as above using $$y_{i} = y_{i1} - y_{i3}$$ and $$z_{i}$$, and

SS2 is the SS as above using $$y_{i} = y_{i1} + y_{i3} - 2y_{i2}$$ and $$z_{i}$$.

SS1 has a SS1(constant) and a SS1(covariate) and SS2 similarly has a SS2(constant) and a SS2(covariate).

SS(W1) = $$\left[ \frac{\mbox{contrast(Full)}}{\mbox{contrast coefficient for SS1}} \right]^{2} \mbox{SS1(constant)} + \left[ \frac{\mbox{contrast(Full)}}{\mbox{contrast coefficient for SS2}} \right]^{2} \mbox{SS2(constant)}$$

SS(W1) is independent of the covariate, as it depends only on the y contrasts

$$y_{i} = y_{i1} - y_{i3}$$ and $$y_{i} = y_{i1} + y_{i3} - 2y_{i2}$$.

SS(W1) = $$N (\overline{y_{1} - y_{3}})^{2} + 0.33 N (\overline{y_{1} + y_{3} - 2y_{2}})^{2}$$

GLM uses a re-scaling of the above:

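SS(W1) = $$N (\overline{0.707 y_{1} - 0.707 y_{3}})^{2} + 0.33 N (\overline{0.707 y_{1} + 0.707 y_{3} - 2(0.707) y_{2}})^{2}$$

Here 0.33 = (0.408/0.707)² is the squared ratio of the full contrast coefficient to the single contrast coefficient. The multiplier becomes 1.33 when the second contrast is written as 0.5(y1 + y3) – y2, as in the SPSS example.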

SS(W1 x covariate) = $$\left[ \frac{\mbox{contrast(Full)}}{\mbox{contrast coefficient for SS1}} \right]^{2} \mbox{SS1(covariate)} + \left[ \frac{\mbox{contrast(Full)}}{\mbox{contrast coefficient for SS2}} \right]^{2} \mbox{SS2(covariate)}$$

SS(W1 x covariate) = $$\frac{[\sum_{i} z_{i} (y_{i1} - y_{i3})]^{2}}{\sum_{i} z_{i}^{2}} + 0.33 \frac{[\sum_{i} z_{i} (y_{i1} + y_{i3} - 2y_{i2})]^{2}}{\sum_{i} z_{i}^{2}}$$

GLM uses a re-scaling of the above:

SS(W1 x covariate) = $$\frac{[\sum_{i} z_{i} (0.707 y_{i1} - 0.707 y_{i3})]^{2}}{\sum_{i} z_{i}^{2}} + 0.33 \frac{[\sum_{i} z_{i} (0.707 y_{i1} + 0.707 y_{i3} - 2(0.707) y_{i2})]^{2}}{\sum_{i} z_{i}^{2}}$$

By subtraction:

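SS(subjects x W1) = $$\sum_{i} (y_{i1} - y_{i3})^{2} + 0.33 \sum_{i} (y_{i1} + y_{i3} - 2y_{i2})^{2} - \mbox{SS(W1)} - \mbox{SS(W1 x covariate)}$$

with every term taken on the same scale (under the GLM re-scaling each y contrast is multiplied by 0.707).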
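The 3-level decomposition can be sketched numerically by scaling each contrast to unit length, which is assumed here to correspond to the re-scaling GLM applies. The data are simulated and all names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 12

# Hypothetical scores for 3 within-subject levels and a constant covariate.
Y = rng.normal(size=(N, 3)) + np.array([0.5, 0.0, 0.2])
z = rng.normal(size=N)
z = z - z.mean()                             # centered covariate

# Two orthogonal contrasts, each scaled to unit length (an assumption here).
C = np.array([[1.0, 0.0, -1.0],
              [1.0, -2.0, 1.0]])
C = C / np.linalg.norm(C, axis=1, keepdims=True)

T = Y @ C.T                                  # one column of contrast scores per contrast

ss_w1 = N * np.sum(T.mean(axis=0) ** 2)
ss_w1_cov = np.sum((z @ T) ** 2) / np.sum(z ** 2)
total_ss = np.sum(T ** 2)
ss_subj_w1 = total_ss - ss_w1 - ss_w1_cov    # obtained by subtraction

print(ss_w1, ss_w1_cov, ss_subj_w1)
```

Each column of `T` is treated exactly as the difference score in the 2-level case, and the per-contrast sums of squares are added.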

Example using SPSS GLM on a W1 with 3 levels and a centered covariate

This uses the data here comprising 3 within subject levels y1 to y3 and one constant covariate zx1.

The analysis uses the default 3 x 2 orthogonal (polynomial) transformation matrix (using all 3 levels). Transposed, this matrix is of the form (-0.707, 0, 0.707; 0.408, -0.816, 0.408).

Consider the two contrasts which make up this matrix: (-0.707,0,0.707) and (0.408, -0.816, 0.408). Since these contrasts are orthogonal the sums of squares from anovas using each of these, SS1 and SS2 respectively, may be summed to give the total sums of squares associated with using both these contrasts together.
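A quick check that the two contrasts are orthogonal and of unit length, which is the property that lets the two sums of squares be added, might look like this. The exact values $$1/\sqrt{2}$$ and $$1/\sqrt{6}$$ are assumed to be what the rounded coefficients 0.707, 0.408 and 0.816 represent.

```python
import numpy as np

# Rounded contrast coefficients as quoted in the text.
rounded = np.array([[-0.707, 0.0, 0.707],
                    [0.408, -0.816, 0.408]])

# Assumed exact form: linear and quadratic contrasts scaled to unit length.
exact = np.array([[-1.0, 0.0, 1.0],
                  [1.0, -2.0, 1.0]])
exact = exact / np.linalg.norm(exact, axis=1, keepdims=True)

gram = exact @ exact.T      # identity matrix if the rows are orthonormal

print(np.round(exact, 3))
print(np.round(gram, 6))
```

The off-diagonal entries of `gram` are zero (orthogonal contrasts) and the diagonal entries are one (unit length).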

Anova 1: response y1 – y3, covariate zx1, gives SS1. Uses contrast (0.707, 0, -0.707).

Anova 2: response 0.5(y1 + y3) – y2, covariate zx1, gives SS2. Uses contrast (0.354, -0.707, 0.354).

Ratio of contrast coefficients (full: single binary contrast):

Anova 1: 0.707/0.707 = 1

Anova 2: 0.408/0.354 = 0.816/0.707 = 1.154, which squared gives the multiplier 1.332.

SS1(W1) = 12 x [0.707 x 0.25] x [0.707 x 0.25] = 0.371; SS2(W1) = 12 x 1.332 x [0.707 x 0.458] x [0.707 x 0.458] = 1.68

SS(W1) = SS1(W1) + SS2(W1) = 0.371 + 1.68 = 2.05

SS1(W1 x covariate) = [0.707 x 1.2624] x [0.707 x 1.2624] / 11 = 0.0724

SS2(W1 x covariate) = 1.332 x [0.707 x 0.2572] x [0.707 x 0.2572] / 11 = 0.004

SS(W1 x covariate) = SS1(W1 x covariate) + SS2(W1 x covariate) = 0.0724 + 0.004 = 0.0764.
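The arithmetic above can be retraced in a few lines. The sketch below only re-uses the figures quoted in the example, reading 0.25 and 0.458 as the mean contrast scores, 1.2624 and 0.2572 as the cross-products with the covariate, and 11 as the covariate sum of squares (an interpretation, since the example does not label them). Because those inputs are rounded, the results should be expected to match the quoted values only approximately.

```python
n = 12                                   # number of subjects
scale = 0.707                            # GLM re-scaling of each contrast
ratio_squared = (0.816 / 0.707) ** 2     # squared coefficient ratio for Anova 2

# Sums of squares for W1 from the two single-contrast anovas.
ss1_w1 = n * (scale * 0.25) ** 2
ss2_w1 = n * ratio_squared * (scale * 0.458) ** 2
ss_w1 = ss1_w1 + ss2_w1

# Sums of squares for the W1 x covariate interaction.
ss1_cov = (scale * 1.2624) ** 2 / 11
ss2_cov = ratio_squared * (scale * 0.2572) ** 2 / 11
ss_cov = ss1_cov + ss2_cov

print(round(ss_w1, 2), round(ss_cov, 4))
```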

MRC CBU Wiki
