Diff for "FAQ/Simon" - CBU statistics Wiki

Testing normality including skewness and kurtosis

High levels of skewness (asymmetry) and kurtosis (heaviness of the tails, often described as peakedness) in the residuals of a regression/ANOVA model (which may be saved in SPSS) are not desirable and can undermine these analyses. SPSS gives these values (see CBSU Stats methods talk on exploratory data analysis). Steve Simon (see here) gives some sound advice on checking normality assumptions, including rules of thumb on just how large skew and kurtosis must be before you should start worrying about the validity of a statistical analysis. His main points, from an e-mail to the SPSSX-L mailing list of 3rd December 2009, are reproduced below:

  • There are no official rules about cut-off criteria to decide just how large skew or kurtosis values must be to indicate non-normality.
  • Avoid using a test of significance, because it has too much power when the assumption of normality is least important and too little power when the assumption of normality is most important.
  • I generally don't get too excited about skewness unless it is larger than +/- 1 or so.

[Note: Hair, J. F. Jr., Anderson, R. E., Tatham, R. L. and Black, W. C. (1998). Multivariate Data Analysis. 5th Edition. Prentice-Hall: New Jersey, give the same cut-offs for skewness.]

  • Streiner and Norman (1995), in the book "Health Measurement Scales" suggest that if 80%+ of individuals are responding at one end of the scale, you have a problem, otherwise, it doesn't matter.
  • SPSS defines kurtosis in a truly evil way by subtracting 3 from the value of the fourth central standardized moment. A value of 6 or larger on the true kurtosis (or a value of 3 or more on the perverted definition of kurtosis that SPSS uses) indicates a large departure from normality. Very small values of kurtosis also indicate a deviation from normality, but it is a very benign deviation. This indicates very light tails, as might happen if the data is truncated or sharply bounded on both the low end and the high end.
  • Don't let skewness and kurtosis prevent you from also graphically examining normality. A histogram and/or a Q-Q plot are very helpful here.
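The relationship between the two kurtosis definitions and the rules of thumb above can be illustrated with a minimal sketch. It uses plain moment formulas and only the Python standard library; the function names `shape_statistics` and `rule_of_thumb_flags` are hypothetical, and the default cut-offs are simply the values quoted in the list above. SPSS applies small-sample corrections to its skewness and kurtosis, so its output will differ somewhat from these uncorrected values, especially for small samples.

```python
def shape_statistics(values):
    """Return (skewness, true kurtosis, excess kurtosis) from plain moments.

    "True" kurtosis is the fourth standardized central moment (3 for a
    normal distribution); excess kurtosis subtracts 3, which is the
    convention SPSS reports (0 for a normal distribution).
    """
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    m4 = sum((x - mean) ** 4 for x in values) / n  # fourth central moment
    skewness = m3 / m2 ** 1.5
    true_kurtosis = m4 / m2 ** 2
    excess_kurtosis = true_kurtosis - 3.0
    return skewness, true_kurtosis, excess_kurtosis


def rule_of_thumb_flags(values, skew_cutoff=1.0, excess_kurtosis_cutoff=3.0):
    """Flag values beyond the rule-of-thumb cut-offs (assumed defaults)."""
    skewness, _, excess_kurtosis = shape_statistics(values)
    return {
        "skewness": abs(skewness) > skew_cutoff,
        "kurtosis": excess_kurtosis >= excess_kurtosis_cutoff,
    }


# Illustrative residuals (made up for this sketch): one large positive value.
residuals = [1, 1, 1, 1, 10]
print(shape_statistics(residuals))
print(rule_of_thumb_flags(residuals))
```

A flag from this sketch is only a prompt to look at a histogram or Q-Q plot of the residuals, not a formal test.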

What if most of the variables that I have are normal and a few of them are not? In this case, is it possible to use parametric tests?

These lecture notes on page 12 also give the +/- 3 rule of thumb for kurtosis cut-offs.

Curran et al. (1996) suggest thresholds of 2.0 for skewness and 7.0 for kurtosis as indicating moderate non-normality when assessing multivariate normality, which is assumed in factor analysis and MANOVA. There is also a graphical method for assessing multivariate normality: plot the j-th ordered squared Mahalanobis distance of the n cases, computed over the p variables, against the (j-0.5)/n quantile of a chi-square distribution with p degrees of freedom. If the variables have a multivariate normal distribution the points will lie close to a straight line.

Details are in Stevens (2001) and Johnson and Wichern (1992, 3rd edition). The appendix of DeCarlo's 1997 paper (pdf paper version here) contains an SPSS macro, and Burdenski (2000) (paper in pdf format is here) gives a less flexible non-macro form, also as SPSS code. An SPSS macro version of the Burdenski SPSS syntax is given here. Both the DeCarlo and Burdenski SPSS syntax plot the ordered Mahalanobis distances against chi-square quantiles, as described above, as a graphical test of multivariate normality. Best fitting lines can be added to the scatterplots (see here for how to do this in SPSS) to help assess linearity.
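The chi-square plot of ordered Mahalanobis distances can be sketched as follows. This is not the DeCarlo or Burdenski syntax: it is a minimal illustration restricted to p = 2 variables so that it needs only the Python standard library, since with 2 degrees of freedom the chi-square quantile has the closed form -2 ln(1 - q). The function name `chi_square_plot_pairs` and the data are hypothetical; for general p one would take the covariance inverse and chi-square quantiles from a statistics library.

```python
import math


def chi_square_plot_pairs(xs, ys):
    """Return (chi-square quantile, ordered squared distance) pairs for p = 2.

    Squared Mahalanobis distances use the sample covariance matrix
    (n - 1 denominator); quantiles are taken at (j - 0.5) / n.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Elements of the 2 x 2 sample covariance matrix.
    sxx = sum((x - mean_x) ** 2 for x in xs) / (n - 1)
    syy = sum((y - mean_y) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    det = sxx * syy - sxy ** 2
    # Squared distance d' S^{-1} d, with the 2 x 2 inverse written out.
    distances = []
    for x, y in zip(xs, ys):
        dx, dy = x - mean_x, y - mean_y
        distances.append((syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det)
    distances.sort()
    # Chi-square quantile with 2 degrees of freedom: -2 ln(1 - q).
    quantiles = [-2.0 * math.log(1.0 - (j - 0.5) / n) for j in range(1, n + 1)]
    return list(zip(quantiles, distances))


# Made-up bivariate data, purely for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 1, 4, 3, 6]
for quantile, distance in chi_square_plot_pairs(xs, ys):
    print(round(quantile, 3), round(distance, 3))
```

Plotting the second element of each pair against the first gives the scatterplot described above; points lying close to a straight line are consistent with bivariate normality.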

References

Burdenski, T. (2000). Evaluating univariate, bivariate, and multivariate Normality using graphical and statistical procedures. Multiple Linear Regression Viewpoints, 26(2), 15-28.

Curran, P. J., West, S. G., & Finch, J. F. (1996). The robustness of test statistics to nonnormality and specification error in confirmatory factor analysis. Psychological Methods, 1(1), 16-29.

Johnson, R. A. and Wichern, D. W. (1992). Applied Multivariate Statistical Analysis. 3rd Edition. Prentice-Hall: Englewood Cliffs, New Jersey.

Johnson, R. A. and Wichern, D. W. (2007). Applied Multivariate Statistical Analysis. 6th Edition. Pearson: New Jersey.

Looney, S. W. (1995). How to use tests for univariate normality to assess multivariate normality. American Statistician, 49(1), 64-70.

Stevens, J. P. (2001). Applied Multivariate Statistics for the Social Sciences. Psychology Press: London.

Streiner, D. L. and Norman, G. R. (1995). Health Measurement Scales: A practical guide to their development and use. 2nd Edition. Oxford Medical Publications, Inc.

None: FAQ/Simon (last edited 2018-08-14 09:28:35 by PeterWatson)