FAQ/RatCaseVar - CBU statistics Wiki

Revision 8 as of 2007-01-25 12:12:11

Generalizability and Sample Size

The following is taken from Hair et al. (1998).

In addition to its role in determining statistical power, sample size also influences the generalizability of results through the ratio of observations to predictor variables.

A general rule is that there should be at least five observations for each independent variable. Any ratio below this runs the risk of overfitting, making the results too specific to the sample to generalize.

Ideally there should be 15 to 20 observations per independent variable. If a stepwise procedure is used, it is recommended that there be at least 50 observations per variable (Wilkinson, 1975).
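The ratio guidelines above can be sketched as a small check. This is only an illustration: the function name `classify_ratio` and its labels are hypothetical, and the thresholds (5, 15 and 50 observations per variable) are simply the figures quoted above, with 15 taken as the lower end of the "ideal" range.

```python
def classify_ratio(n_observations, n_variables, stepwise=False):
    """Rate a sample size against the observations-per-variable guidelines.

    Hypothetical helper: thresholds follow the rules of thumb quoted in the text.
    """
    if n_observations <= 0 or n_variables <= 0:
        raise ValueError("counts must be positive")

    ratio = n_observations / n_variables

    if stepwise:
        # Stepwise procedures: at least 50 observations per variable.
        if ratio >= 50:
            return "meets stepwise guideline"
        return "below stepwise guideline"

    if ratio < 5:
        # Below 5 to 1: risk of overfitting.
        return "below minimum"
    if ratio < 15:
        # At least 5 to 1, but short of the ideal 15 to 20.
        return "meets minimum"
    return "meets ideal"


# Example: 100 observations and 10 predictors give a ratio of 10 to 1.
print(classify_ratio(100, 10))
```

For instance, 200 observations with 10 predictors would count as ideal for a standard regression but would fall short of the stepwise guideline.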

The ratio of 5 to 1 applies to multiple regression, factor analysis and discriminant analyses, including logistic regression. For discriminant analyses it is additionally recommended that the smallest group size exceed the number of predictor variables. Ideally each group should contain at least 20 observations.
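The two group-size recommendations for discriminant analyses can be checked in the same spirit. The sketch below assumes the group membership of each observation is available as a list of labels; the function name `check_group_sizes` and the keys of the returned dictionary are hypothetical.

```python
from collections import Counter


def check_group_sizes(group_labels, n_predictors):
    """Check discriminant-analysis group sizes against the quoted guidelines.

    Hypothetical helper: group_labels holds one label per observation.
    """
    if not group_labels:
        raise ValueError("group_labels must not be empty")

    sizes = Counter(group_labels)      # observations per group
    smallest = min(sizes.values())     # size of the smallest group

    return {
        "group_sizes": dict(sizes),
        # Smallest group should exceed the number of predictor variables.
        "smallest_exceeds_predictors": smallest > n_predictors,
        # Ideally every group has at least 20 observations.
        "all_groups_at_least_20": smallest >= 20,
    }


# Example: two groups of 25 and 6 observations with 5 predictors.
labels = ["a"] * 25 + ["b"] * 6
print(check_group_sizes(labels, 5))
```

In this example the smallest group (6) exceeds the 5 predictors, so the first recommendation is met, but the group falls short of the ideal 20 observations.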

If the discriminant group sizes vary markedly, there can be a tendency to classify a disproportionately large number of observations into the larger groups.

In factor analysis, sample size can also influence the thresholds for

  • [:FAQ/patternMatrix:determining what is a high factor loading.]

References

Hair Jr, J.F., Anderson, R.E., Tatham, R.L. and Black, W.C. (1998) Multivariate data analysis, fourth edition. Prentice-Hall: London.

Wilkinson, L. (1975) Tests of significance in stepwise regression. Psychological Bulletin 86, 168-174.