Diff for "FAQ/Simon" - CBU statistics Wiki
Differences between revisions 25 and 26
Revision 25 as of 2011-01-12 12:47:19
Size: 3831
Editor: PeterWatson
Revision 26 as of 2011-01-12 12:51:33
Size: 4010
Editor: PeterWatson

Testing normality including skewness and kurtosis

High levels of skewness (asymmetry) and kurtosis (peakedness or heaviness of tails) in the residuals of a regression/ANOVA model (which may be saved in SPSS) are undesirable and can undermine these analyses. SPSS reports both values (see the CBSU Stats methods talk on [http://www.mrc-cbu.cam.ac.uk/Statistics/Resources/Lectures2005/2-eda-PW.ppt exploratory data analysis]). [http://www.childrens-mercy.org/stats/ Steve Simon] gives some sound advice on checking normality assumptions, including rules of thumb on how large skewness and kurtosis must be before they become a concern for statistical analyses. His main points are reproduced below:

  • There are no official rules about cut-off criteria to decide just how large skew or kurtosis values must be to indicate non-normality.
  • Avoid using a test of significance, because it has too much power when the assumption of normality is least important and too little power when the assumption of normality is most important.
  • I generally don't get too excited about skewness unless it is larger than +/- 1 or so.

[Note: Hair, J.F. Jr., Anderson, R.E., Tatham, R.L. and Black, W.C. (1998). Multivariate Data Analysis. 5th Edition. Prentice-Hall: New Jersey, give the same cut-offs for skewness.]

  • SPSS defines kurtosis in a truly evil way by subtracting 3 from the value of the fourth central standardized moment. A value of 6 or larger on the true kurtosis (or a value of 3 or more on the perverted definition of kurtosis that SPSS uses) indicates a large departure from normality. Very small values of kurtosis also indicate a deviation from normality, but it is a very benign deviation. This indicates very light tails, as might happen if the data is truncated or sharply bounded on both the low end and the high end.
  • Don't let skewness and kurtosis prevent you from also graphically examining normality. A histogram and/or a Q-Q plot are very helpful here.
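
The two kurtosis conventions in the points above can be compared numerically. The following is a minimal Python sketch, not SPSS code: it uses the simple moment estimators (divisor n), whereas SPSS reports sample-adjusted versions, so its output may differ somewhat in small samples. The function names and the default limits (1 for skewness, 3 for excess kurtosis) are illustrative choices taken from the rules of thumb quoted above.

```python
def moments(values):
    """Return (skewness, kurtosis, excess_kurtosis) from simple moment estimators."""
    n = len(values)
    mean = sum(values) / n
    # Central moments with divisor n (no small-sample correction).
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    if m2 == 0:
        raise ValueError("skewness and kurtosis are undefined for constant data")
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2          # "true" kurtosis: equals 3 for a normal distribution
    return skewness, kurtosis, kurtosis - 3.0   # last value follows the SPSS-style convention


def flag_non_normality(values, skew_limit=1.0, excess_kurt_limit=3.0):
    """Apply the rules of thumb quoted above (hypothetical helper for illustration)."""
    skewness, kurtosis, excess = moments(values)
    return {
        "skewness": skewness,
        "kurtosis": kurtosis,
        "excess_kurtosis": excess,
        "skew_flag": abs(skewness) > skew_limit,
        "kurtosis_flag": excess >= excess_kurt_limit,
    }


if __name__ == "__main__":
    residuals = [1, 1, 1, 1, 10]     # made-up values, strongly right-skewed
    print(flag_non_normality(residuals))
```

The flags are only a screening aid; as the last point above says, a histogram or Q-Q plot of the residuals should still be examined.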

[:FAQ/Simon/question: What if most of the variables that I have are normal and a few of them are not? In this case, is it possible to use parametric tests?]

Curran et al. (1996) suggest thresholds of 2.0 for skewness and 7.0 for kurtosis as marking moderate non-normality when assessing multivariate normality, which is assumed in factor analysis and MANOVA. There is also a graphical method for assessing multivariate normality: plot the j-th ordered squared Mahalanobis distance of the n observations on p variables against the (j-0.5)/n quantile of a chi-square distribution with p degrees of freedom. If the variables have a multivariate normal distribution the points will lie close to a straight line. Details are in Stevens (2001) and Johnson and Wichern (1992, 3rd edition). SPSS code that plots the ordered Mahalanobis distances against chi-square quantiles is available both as a macro in the appendix of DeCarlo's (1997) paper ([:FAQ/mvnormc:pdf paper version here]) and, in less flexible non-macro form, in [attachment:mahalplot.pdf this pdf attachment]. This gives a graphical test of multivariate Normality in the spirit of the Q-Q plots mentioned above by Simon. Best fitting lines can be added to the scatterplots (see [:FAQ/regline:here] for how to do this in SPSS).
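
The chi-square plot can be sketched in a few lines of Python. This is an illustration only, not the SPSS code from the attachments. It assumes n observations on p = 2 variables, so that the covariance matrix can be inverted by hand and the chi-square quantile has the closed form -2 ln(1 - prob). Following the usual textbook construction (as in Johnson and Wichern), the j-th ordered squared distance is paired with the chi-square quantile at probability (j-0.5)/n on p degrees of freedom. The function name and the example data are made up.

```python
import math


def chi_square_plot_points(xs, ys):
    """Return (chi-square quantile, ordered squared Mahalanobis distance) pairs for p = 2."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    # Sample variances and covariance (divisor n - 1).
    sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - my) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)
    det = sxx * syy - sxy ** 2
    if det == 0:
        raise ValueError("covariance matrix is singular")
    # Squared distance d^2 = (v - mean)' S^-1 (v - mean), written out with the 2x2 inverse.
    d2 = sorted(
        (syy * (x - mx) ** 2 - 2 * sxy * (x - mx) * (y - my) + sxx * (y - my) ** 2) / det
        for x, y in zip(xs, ys)
    )
    # Chi-square quantiles on 2 degrees of freedom at probabilities (j - 0.5) / n.
    quantiles = [-2.0 * math.log(1.0 - (j - 0.5) / n) for j in range(1, n + 1)]
    return list(zip(quantiles, d2))


if __name__ == "__main__":
    xs = [1, 2, 3, 4, 5, 6]          # made-up data for illustration
    ys = [2, 1, 4, 3, 6, 5]
    for q, d in chi_square_plot_points(xs, ys):
        print(round(q, 3), round(d, 3))
```

Plotting the second value of each pair against the first gives the chi-square plot; points falling near a straight line through the origin with slope 1 are consistent with bivariate normality. For more than two variables a general matrix inverse and a chi-square quantile function on p degrees of freedom would be needed.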

  • [:FAQ/mvnormc: More on multivariate Normality testing (useful for MANOVA).]

References

Curran, P. J., West, S. G., & Finch, J. F. (1996). The robustness of test statistics to nonnormality and specification error in confirmatory factor analysis. Psychological Methods, 1(1), 16-29.

Johnson, R.A. and Wichern, D.W. (1992). Applied Multivariate Statistical Analysis. 3rd Edition. Prentice-Hall: Englewood Cliffs, New Jersey.

Johnson, R.A. and Wichern, D.W. (2007). Applied Multivariate Statistical Analysis. 6th Edition. Pearson: New Jersey.

Looney, S.W. (1995). How to use tests for univariate normality to assess multivariate normality. American Statistician, 49(1), 64-70.

Stevens, J.P. (2001). Applied Multivariate Statistics for the Social Sciences. Psychology Press: London.

FAQ/Simon (last edited 2018-08-14 09:28:35 by PeterWatson)