Explaining different types of sums of squares (SS) in factorial ANOVAs

There is no consensus about which of the three types of sums of squares to use in factorial ANOVAs when testing main effects and interactions. Gang Chen (NIMH) compares these sums of squares, as used in factorial ANOVAs, on his webpage located here. The problem may be sidestepped by removing any non-significant interactions (e.g., using a custom model in SPSS) or, if an interaction is statistically significant, by running simple effects analyses on individual groups.

In case this link is broken, the details are reproduced below with a few additions:

Types of Sums of Squares

With flexibility (especially for unbalanced designs) and expansion in mind, this ANOVA package was implemented with a general linear model (GLM) approach. There are different ways to quantify factors (categorical variables) by assigning values to a nominal or ordinal variable, but here each factor level and each applicable interaction is coded as a binary dummy (indicator) variable. An ANOVA can be written as a general linear model:

Y = b_0 + b_1 X_1 + b_2 X_2 + ... + b_k X_k + e

In matrix notation, this reduces to the simple form Y = Xb + e

The design matrix for a 2-way ANOVA with a 2X3 factorial design looks like this (the Data columns give the level of factors A and B for each row; the remaining columns form the design matrix):

Data    Design Matrix
A  B    constant  A1  A2  B1  B2  B3  A1B1  A1B2  A1B3  A2B1  A2B2  A2B3
1  1    1         1   0   1   0   0   1     0     0     0     0     0
1  2    1         1   0   0   1   0   0     1     0     0     0     0
1  3    1         1   0   0   0   1   0     0     1     0     0     0
2  1    1         0   1   1   0   0   0     0     0     1     0     0
2  2    1         0   1   0   1   0   0     0     0     0     1     0
2  3    1         0   1   0   0   1   0     0     0     0     0     1
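The indicator coding of the 2X3 design matrix can be sketched in a few lines of Python. This is only an illustration, not the package's own code: the function name `indicator_design` and its arguments are hypothetical, and NumPy is assumed to be available.

```python
import numpy as np

def indicator_design(a_levels, b_levels, n_a, n_b):
    """Over-parameterised indicator design matrix for a two-way layout.

    a_levels, b_levels: 1-based factor levels of A and B for each observation.
    Columns: constant, A1..A{n_a}, B1..B{n_b}, then one column per A x B cell.
    """
    a = np.asarray(a_levels) - 1
    b = np.asarray(b_levels) - 1
    const = np.ones((a.size, 1), dtype=int)
    A = np.eye(n_a, dtype=int)[a]                    # one indicator per level of A
    B = np.eye(n_b, dtype=int)[b]                    # one indicator per level of B
    AB = np.eye(n_a * n_b, dtype=int)[a * n_b + b]   # one indicator per cell
    return np.hstack([const, A, B, AB])

# One observation per cell, in the same row order as the 2X3 table.
a_levels = [1, 1, 1, 2, 2, 2]
b_levels = [1, 2, 3, 1, 2, 3]
X = indicator_design(a_levels, b_levels, n_a=2, n_b=3)
print(X)
print("columns:", X.shape[1], "rank:", np.linalg.matrix_rank(X))
```

The matrix has more columns than its rank, so the parameters are not uniquely determined without constraints; the sums of squares discussed below depend only on comparing residuals between models, which is why they can still be computed.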

Removing the effect of a factor or an interaction from the above full model (by deleting the corresponding columns from matrix X) increases the error sum of squares, and this increase serves as a measure of the effect. The ratio of this measure (divided by its degrees of freedom) to an overall error mean square gives an F value, which indicates the significance of the effect.
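The column-removal idea can be written as a small comparison of two least-squares fits. The sketch below assumes NumPy; the helper names `rss` and `extra_ss_f` and the six data values are made up for illustration and do not come from the package.

```python
import numpy as np

def rss(X, y):
    """Residual sum of squares from a least-squares fit of y on the columns of X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

def extra_ss_f(X_full, X_reduced, y):
    """F ratio for the effect whose columns are in X_full but not in X_reduced."""
    rank_full = np.linalg.matrix_rank(X_full)
    rank_reduced = np.linalg.matrix_rank(X_reduced)
    rss_full = rss(X_full, y)
    ss_effect = rss(X_reduced, y) - rss_full     # increase in error after removal
    df_effect = rank_full - rank_reduced
    df_error = len(y) - rank_full
    f_value = (ss_effect / df_effect) / (rss_full / df_error)
    return ss_effect, df_effect, df_error, f_value

# Hypothetical one-factor example: two groups of three observations.
y = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
group = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])
one = np.ones_like(y)

X_full = np.column_stack([one, group])   # constant + group indicator
X_reduced = np.column_stack([one])       # group column deleted

ss_effect, df_effect, df_error, f_value = extra_ss_f(X_full, X_reduced, y)
print(ss_effect, df_effect, df_error, f_value)
```

Which columns are kept in the reduced and full models is exactly where the types of sums of squares differ.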

However, there are different approaches to keeping or removing columns of an effect, and sometimes it is a sensitive and controversial issue among statisticians.

Type I: sequential

The SS for each factor is the incremental improvement in the error SS as each factor effect is added to the regression model. In other words, it is the effect obtained when the factors are entered into the model one at a time, in the order in which they are specified, for example A, B, C, and D in a 4-way ANOVA. The SS can also be viewed as the reduction in residual sum of squares (SSE) obtained by adding that term to a fit that already includes the terms listed before it. Type I SS is the default in R and GENSTAT.

Pros:

(1) Nice property: balanced or not, the SS for all the effects, together with the error SS, add up to the total SS, giving a complete decomposition of the predicted sums of squares for the whole model. This is not generally true for any other type of sums of squares.

(2) Preferable when some factors (for example in nested designs) should be taken out before other factors. For example, with unequal numbers of males and females, the factor "gender" should precede "subject" in an unbalanced design.

Cons:

(1) Order matters! Hypotheses depend on the order in which effects are specified. If you fit a 2-way ANOVA with two models, one with A then B, the other with B then A, not only can the type I SS for factor A be different under the two models, but there is NO certain way to predict whether the SS will go up or down when A comes second instead of first.

This lack of invariance to order of entry into the model limits the usefulness of Type I sums of squares for testing hypotheses for certain designs.

(2) Not appropriate for factorial designs
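The order dependence and the additivity of Type I SS can both be seen on a small unbalanced example. The sketch below assumes NumPy and uses a hypothetical 2x2 data set with cell sizes 3, 1, 1 and 3; none of the numbers come from the article.

```python
import numpy as np

def rss(X, y):
    """Residual sum of squares from a least-squares fit of y on the columns of X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

def model(*cols):
    return np.column_stack(cols)

# Hypothetical unbalanced 2x2 data (0/1 coding for both factors).
a = np.array([0, 0, 0, 0, 1, 1, 1, 1], dtype=float)
b = np.array([0, 0, 0, 1, 0, 1, 1, 1], dtype=float)
y = np.array([3.0, 4.0, 5.0, 6.0, 2.0, 7.0, 8.0, 10.0])
one = np.ones_like(y)

r_1 = rss(model(one), y)
r_a = rss(model(one, a), y)
r_b = rss(model(one, b), y)
r_a_b = rss(model(one, a, b), y)
r_full = rss(model(one, a, b, a * b), y)

# Entry order A, B, AB
ss_a_first = r_1 - r_a
ss_b_after_a = r_a - r_a_b
# Entry order B, A, AB
ss_b_first = r_1 - r_b
ss_a_after_b = r_b - r_a_b
ss_ab = r_a_b - r_full

print("A entered first :", ss_a_first)
print("A entered second:", ss_a_after_b)
print("sum, order A,B,AB:", ss_a_first + ss_b_after_a + ss_ab)
print("sum, order B,A,AB:", ss_b_first + ss_a_after_b + ss_ab)
print("model SS         :", r_1 - r_full)
```

In both orders the three sequential SS add up to the same model SS, but the share attributed to A changes considerably with its position.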

Type II: hierarchical or partially sequential

SS is the reduction in residual error due to adding the term to the model after all other terms except those that contain it, or the reduction in residual sum of squares obtained by adding that term to a model consisting of all other terms that do not contain the term in question. An interaction comes into play only when all involved factors are included in the model. For example, the SS for main effect of factor A is not adjusted for any interactions involving A: AB, AC and ABC, and sums of squares for two-way interactions control for all main effects and all other two-way interactions, and so on.

Pros:

(1) appropriate for model building, and natural choice for regression.

(2) most powerful when there is no interaction

(3) invariant to the order in which effects are entered into the model

(4) invariant to coding of dummy variables comprising the factors. For example, using 1, -1 coding gives the same SS for factor A adjusted for B as using 0, 1 coding (Langsrud, 2003).

Cons:

(1) For factorial designs with unequal cell samples, Type II sums of squares test hypotheses that are complex functions of the cell ns that ordinarily are not meaningful.

(2) Not appropriate for factorial designs

(3) Assumes interactions are small or non-existent
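The coding invariance of Type II SS can be checked numerically. The following sketch assumes NumPy; the function `type2_ss_a` and the unbalanced 2x2 data are hypothetical and only serve to compare 0, 1 coding with 1, -1 coding.

```python
import numpy as np

def rss(X, y):
    """Residual sum of squares from a least-squares fit of y on the columns of X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

def type2_ss_a(a_col, b_col, y):
    """Type II SS for A in a two-factor model: SS(A|B) = R(B) - R(A,B)."""
    one = np.ones_like(y)
    r_b = rss(np.column_stack([one, b_col]), y)
    r_a_b = rss(np.column_stack([one, a_col, b_col]), y)
    return r_b - r_a_b

# Hypothetical unbalanced 2x2 data (cell sizes 3, 1, 1, 3).
a01 = np.array([0, 0, 0, 0, 1, 1, 1, 1], dtype=float)
b01 = np.array([0, 0, 0, 1, 0, 1, 1, 1], dtype=float)
y = np.array([3.0, 4.0, 5.0, 6.0, 2.0, 7.0, 8.0, 10.0])

ss_dummy = type2_ss_a(a01, b01, y)                    # 0, 1 coding
ss_effect = type2_ss_a(2 * a01 - 1, 2 * b01 - 1, y)   # 1, -1 coding

print(ss_dummy, ss_effect)
```

Both codings span the same column space for the models being compared, so the residuals, and hence SS(A|B), agree.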

Type III: marginal or orthogonal

SS gives the sum of squares that would be obtained for each variable if it were entered last into the model. That is, the effect of each variable is evaluated after all other factors have been accounted for. Therefore the result for each term is equivalent to what is obtained with Type I analysis when the term enters the model as the last one in the ordering. Type III SS is the default in most statistics packages including SPSS.

Pros:

Not sample size dependent: effect estimates are not a function of the frequency of observations in any group (this matters for unbalanced data, where there are unequal numbers of observations in each group). When there are no missing cells in the design, the hypotheses tested concern the subpopulation (least squares) means, which are the best linear unbiased estimates of the marginal means for the design.

Cons:

(1) Testing main effects in the presence of interactions. Nelder (1994) and Langsrud (2003) state that main effects will always be present if an interaction is present, so main effects are not worth looking at when there is a significant interaction.

(2) Not appropriate for designs with missing cells: for ANOVA designs with missing cells, Type III sums of squares generally do not test hypotheses about least squares means, but instead test hypotheses that are complex functions of the patterns of missing cells in higher-order containing interactions and that are ordinarily not meaningful.

(3) The sums of squares (and F ratios) for the main effects will differ depending upon how the dummy variables representing the group factors are coded. For example, using 1, -1 coding for a pair of two-group factors, A and B, gives a different SS for factor A adjusted for B and AxB than using 0, 1 factor coding (see the 'Constraints on parameters' section in Nelder, 1994).
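The coding dependence in point (3) can be reproduced by dropping the A column while keeping the AxB product column. This is a sketch assuming NumPy, with a hypothetical function `type3_ss_a` and made-up unbalanced 2x2 data; real packages may build their Type III contrasts differently.

```python
import numpy as np

def rss(X, y):
    """Residual sum of squares from a least-squares fit of y on the columns of X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

def type3_ss_a(a_col, b_col, y):
    """SS(A|B,AB) = R(B,AB) - R(A,B,AB), with AB coded as the product a_col * b_col."""
    one = np.ones_like(y)
    ab = a_col * b_col
    r_b_ab = rss(np.column_stack([one, b_col, ab]), y)
    r_full = rss(np.column_stack([one, a_col, b_col, ab]), y)
    return r_b_ab - r_full

# Hypothetical unbalanced 2x2 data (cell sizes 3, 1, 1, 3).
a01 = np.array([0, 0, 0, 0, 1, 1, 1, 1], dtype=float)
b01 = np.array([0, 0, 0, 1, 0, 1, 1, 1], dtype=float)
y = np.array([3.0, 4.0, 5.0, 6.0, 2.0, 7.0, 8.0, 10.0])

ss_a_dummy = type3_ss_a(a01, b01, y)                    # 0, 1 coding
ss_a_effect = type3_ss_a(2 * a01 - 1, 2 * b01 - 1, y)   # 1, -1 coding

print("0, 1 coding :", ss_a_dummy)
print("1, -1 coding:", ss_a_effect)
```

With 0, 1 coding the reduced model forces the two A groups to be equal only at the first level of B, whereas with 1, -1 coding it forces the unweighted marginal means of A to be equal, so the two SS answer different questions.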

Type IV: Goodnight or balanced

A variation of Type III, but specifically developed for designs with missing cells.

Suppose we have a model with two factors and the terms appear in the order A, B, AB. Let R(·) represent the residual sum of squares for a model, so for example R(A,B,AB) is the residual sum of squares fitting the whole model, R(A) is the residual sum of squares fitting just the main effect of A, and R(1) is the residual sum of squares fitting just the mean. The three types of sums of squares are calculated as follows:

Term   Type 1 SS                      Type 2 SS                      Type 3 SS
A      SS(A)=R(1)-R(A)                SS(A|B)=R(B)-R(A,B)            SS(A|B,AB)=R(B,AB)-R(A,B,AB)
B      SS(B|A)=R(A)-R(A,B)            SS(B|A)=R(A)-R(A,B)            SS(B|A,AB)=R(A,AB)-R(A,B,AB)
AB     SS(AB|A,B)=R(A,B)-R(A,B,AB)    SS(AB|A,B)=R(A,B)-R(A,B,AB)    SS(AB|A,B)=R(A,B)-R(A,B,AB)
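Written out with the R(·) notation, the Type 1 column shows why only the sequential SS always add up to the model SS: the intermediate residuals cancel in pairs. The derivation uses only the definitions given above.

```latex
\begin{aligned}
SS(A) + SS(B \mid A) + SS(AB \mid A,B)
  &= [R(1) - R(A)] + [R(A) - R(A,B)] + [R(A,B) - R(A,B,AB)] \\
  &= R(1) - R(A,B,AB).
\end{aligned}
```

The Type 2 and Type 3 columns do not telescope in this way, because their terms for A start from R(B) and R(B,AB) rather than from R(1).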

Their relationship:

Effect   Balanced       Unbalanced      Missing Cells
A        I=II=III=IV    III=IV
B        I=II=III=IV    I=II, III=IV    I=II
AB       I=II=III=IV    I=II=III=IV     I=II=III=IV

The type of SS only influences computations on unbalanced data: for orthogonal (balanced) designs it does not matter which type of SS is used, since they are essentially the same. The nice thing about balanced designs is that orthogonality protects us from worrying about any potential interference among factors. Balanced designs in group analysis are therefore desirable whenever possible.
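The agreement of the SS types in a balanced design can be checked directly. The sketch below assumes NumPy and a hypothetical balanced 2x2 data set with two observations per cell, using 1, -1 coding so that the design columns are mutually orthogonal.

```python
import numpy as np

def rss(X, y):
    """Residual sum of squares from a least-squares fit of y on the columns of X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

def model(*cols):
    return np.column_stack(cols)

# Hypothetical balanced 2x2 data, two observations per cell (1, -1 coding).
a = np.array([-1, -1, -1, -1, 1, 1, 1, 1], dtype=float)
b = np.array([-1, -1, 1, 1, -1, -1, 1, 1], dtype=float)
y = np.array([3.0, 5.0, 6.0, 8.0, 2.0, 4.0, 9.0, 11.0])
one = np.ones_like(y)
ab = a * b

# Orthogonality: the cross-product matrix of the design columns is diagonal.
X = model(one, a, b, ab)
gram = X.T @ X
is_orthogonal = np.allclose(gram, np.diag(np.diag(gram)))

ss_a_type1 = rss(model(one), y) - rss(model(one, a), y)                # SS(A)
ss_a_type2 = rss(model(one, b), y) - rss(model(one, a, b), y)          # SS(A|B)
ss_a_type3 = rss(model(one, b, ab), y) - rss(model(one, a, b, ab), y)  # SS(A|B,AB)

print(is_orthogonal, ss_a_type1, ss_a_type2, ss_a_type3)
```

Because the columns are orthogonal, removing the A column changes the residual by the same amount no matter which other columns are in the model.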

In most ANOVA designs, it is assumed that the independent variables are orthogonal (uncorrelated, independent). This corresponds to the absence of multicollinearity in regression models. If there is such a lack of independence, then the ratio of the between to within variances will not follow the F distribution assumed for significance testing.

Only when a design is unbalanced does the type of SS become an issue, hence the controversy over which type of SS to prefer. Two kinds of unbalanced designs in fMRI group analysis are:

(1) Unequal number of subjects across groups.

(2) Missing cells: some subjects fail to perform some tasks.

Currently only two designs of the first kind are available in the package:

(1) 3-way ANOVA BXC(A) (type 3): C is a random factor nested within factor A while B is a fixed factor;

(2) 4-way ANOVA BXCXD(A) (type 3): D is a random factor nested within factor A while B and C are two fixed factors.

There is NO consensus on which type of SS should be used for unbalanced designs, but most statisticians recommend Type III, which is the default in most software packages such as SAS, SPSS, JMP, Minitab, Stata, Statistica, Systat, and Unistat, while R, S-Plus, Genstat, and Mathematica use Type I. However, Langsrud (2003) argues that Type II is preferable when the power of Types II and III is considered.

Both unbalanced designs implemented so far in the Matlab package are nested/mixed designs, so it makes sense to use Type I, with the control factor (group) preceding the primary factor (subject).

Theoretical reasons aside, there is a practical consideration in this package to adopt Type I SS. All ANOVAs are built on a pure crossed (factorial) design. For example, all other 3-way ANOVA types are calculated from the "seed" design AXBXC with all factors being fixed. In mixed design BXC(A) with A and B fixed, and C (usually subject) random and nested within A, we have

SS_BC(A) = SS_BC + SS_ABC,   df_BC(A) = df_BC + df_ABC

As mentioned above, this nice structure only holds with Type I SS, and would collapse with other types of SS.
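The degrees-of-freedom part of this identity can be verified by hand. The symbols below are assumptions introduced for illustration: a levels of A, b levels of B, and c subjects in each level of A (equal group sizes are assumed to keep the counting simple).

```latex
\begin{aligned}
df_{BC}  &= (b-1)(c-1), \\
df_{ABC} &= (a-1)(b-1)(c-1), \\
df_{BC} + df_{ABC} &= (b-1)(c-1)\,[\,1 + (a-1)\,] = a\,(b-1)(c-1) = df_{BC(A)}.
\end{aligned}
```

The right-hand side is the usual count for the B by subject interaction pooled over the a groups: each group contributes (b-1)(c-1) degrees of freedom.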

How much is the difference among different types of SS?

Test Data

Columns: Level, B1, B2, B3, B4

A1: 3, 4, 7, 7, 6, 5, 8, 8, 3, 4, 7, 9, 3, 6, 8, 3

A2: 1, 2, 5, 10, 2, 2, 5, 10, 2, 3, 6, 10, 2, 4, 5, 9, 3, 6, 11

Three ANOVA Summary Tables

Type 1       SS        df   MS       F
A            3.125     1    3.125    4.04
B | A        193.931   3    64.644   83.64
AB | A,B     19.894    3    6.631    8.58
Error        18.550    24   0.77
Total        235.500   31

Type 2       SS        df   MS       F
A | B        2.707     1    2.707    3.50
B | A        193.931   3    64.644   83.64
AB | A,B     19.894    3    6.631    8.58
Error        18.550    24   0.77
Total        235.500   31

Type 3       SS        df   MS       F
A | B, AB    3.199     1    3.199    4.14
B | A, AB    188.726   3    62.909   81.83
AB | A,B     19.894    3    6.631    8.58
Error        18.550    24   0.77
Total        235.500   31
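The arithmetic behind the summary tables can be re-traced from the SS and df columns alone. The sketch below uses only the numbers printed in the tables; the variable names are arbitrary, and it recomputes the Type 1 F ratios and checks which set of SS adds up to the total.

```python
# SS and df values copied from the three summary tables.
error_ss, error_df = 18.550, 24
total_ss = 235.500
mse = error_ss / error_df            # error mean square, about 0.77

effects = {
    "Type 1": {"A": (3.125, 1), "B": (193.931, 3), "AB": (19.894, 3)},
    "Type 2": {"A": (2.707, 1), "B": (193.931, 3), "AB": (19.894, 3)},
    "Type 3": {"A": (3.199, 1), "B": (188.726, 3), "AB": (19.894, 3)},
}

# F ratio = (SS / df) / MSE, shown here for the Type 1 table.
f_type1 = {term: (ss / df) / mse for term, (ss, df) in effects["Type 1"].items()}
for term, f_value in f_type1.items():
    print(f"Type 1 {term}: F = {f_value:.2f}")

# Sum of the effect SS plus the error SS, for comparison with the total SS.
sums = {name: sum(ss for ss, _ in terms.values()) + error_ss
        for name, terms in effects.items()}
for name, value in sums.items():
    print(f"{name}: effects + error = {value:.3f} (total = {total_ss:.3f})")
```

Only the Type 1 SS reproduce the total exactly, which is the additivity property listed among the pros of Type I above.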

Note that in a repeated measures ANOVA with only within-subjects factors there are equal numbers of subjects in each combination of conditions (balance), so Type II and Type III SS are equal. This can also be seen from each within-subjects factor combination having its own error term. The SS will differ, however, in a repeated measures ANOVA that also has a between-subjects factor, B: the SS for a within-subjects factor, W1, is changed by the presence of the W1 x B interaction because both share the same error term, W1 x subjects. In this case the Type II SS for W1 ignores the B x W1 interaction, whereas the Type III SS for W1 adjusts for it. As described above, the Type II SS for W1 should be used if the B x W1 interaction is not statistically significant; alternatively, the B x W1 interaction could be dropped from the model by removing the between-subjects factor (along the lines of MacNaughton (1998) below).

==========================

References

Langsrud, Ø (2003) ANOVA for unbalanced data: use Type II instead of Type III sums of squares. Statistics and Computing, 13, 163-167.

MacNaughton, DB (1998) What sums of squares are best in unbalanced analysis of variance. This is a pdf which is published on-line.

Nelder, JA (1994) The statistics of linear models: back to basics. Statistics and Computing, 4, 221-234. Nelder argues here in favour of using only Type I and Type II SS (i.e., NOT Type III, which is the default in SPSS ANOVAs).