SPSS macro version of Burdenski's program for a graphical plot to assess multivariate Normality
The !MAHPLO macro below plots the Mahalanobis distance of each observation in a chosen group against the chi-square value expected for that observation under multivariate Normality. If the data are multivariate Normal, the points should fall close to a straight line.
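The quantities plotted can be written out as follows. This is a sketch in notation that is not part of the original program: it assumes the value saved by the macro as mahal is the squared distance of case i from the group centroid, computed with the sample covariance matrix S of the k test variables, and that i is the rank of the case after sorting the n distances in ascending order.

```latex
\[
D_i^2 = (\mathbf{x}_i - \bar{\mathbf{x}})^{\top}\,\mathbf{S}^{-1}\,(\mathbf{x}_i - \bar{\mathbf{x}}),
\qquad
p_i = \frac{i - 0.5}{n},
\qquad
q_i = F^{-1}_{\chi^2_k}(p_i)
\]
```

The plot shows the sorted D_i^2 against q_i, where q_i is the p_i quantile of a chi-square distribution with k degrees of freedom (the macro's idf.chisq(p,3) with k = 3).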
DEFINE !MAHPLO (NOBS !TOKENS(1) /gp !tokens(1) /gpval !tokens(1) /vars !cmdend).
list variables=all/cases=9999/format=numbered .
COMMENT 'y' is a variable automatically created by the program,
COMMENT and does not have to be modified for different data sets.
select if (!gp eq !gpval) .
compute y=$casenum .
print formats y(F5) .
regression variables=y !vars/
  descriptive=mean stddev corr/
  dependent=y/enter !vars/
  save=mahal(mahal) .
sort cases by mahal(a) .
execute .
list variables=y !vars mahal/cases=9999/format=numbered .
COMMENT NOBS in the macro call must be the actual n of the
COMMENT selected group.
loop #i=1 to !NOBS .
compute p=($casenum - .5) / !NOBS.
COMMENT In the next line, change '3' to whatever is the number
COMMENT of variables.
COMMENT The p critical value of chi square for a given case
COMMENT is set as [the case number (after sorting) - .5] / the
COMMENT sample size.
if (!gp eq !gpval) chisq=idf.chisq(p,3) .
end loop .
print formats p chisq (F8.5) .
list variables=y p mahal chisq/cases=9999/format=numbered .
GRAPH
  /SCATTERPLOT(BIVAR)=mahal WITH chisq
  /MISSING=LISTWISE .
!ENDDEFINE.
The !MAHPLO macro is run using the call below. There are four inputs: the number of observations in the group (NOBS), the name of the group variable (gp), the value of the group to be tested (gpval) and the variables that are to be tested (vars). Note that the degrees of freedom of the chi-square distribution are fixed at 3 inside the macro, so the idf.chisq line must be edited if a different number of variables is tested. In the SPSS data example given here we wish to test the multivariate Normality of data comprising 26 observations, for a particular group (group=2), based on three variables, x1 to x3. This is actually Fisher's iris data with a 'group' column added to illustrate within-group testing. The resulting plot shown here resembles a straight line (R-squared of 0.94), which is consistent with the three variables following a multivariate Normal distribution in this group.
!MAHPLO NOBS=26 gp=group gpval=2 vars=x1 x2 x3.
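To cross-check the macro's output outside SPSS, the same plotting points can be computed independently. The sketch below is not part of Burdenski's program or the macro: it is a minimal Python version using only the standard library, with hypothetical function names and made-up data, and it assumes that the mahal value saved by SPSS is the squared distance from the group centroid based on the sample covariance matrix.

```python
import math

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    k = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(k):
        piv = max(range(col, k), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, k):
            f = m[r][col] / m[col][col]
            for c in range(col, k + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * k
    for i in reversed(range(k)):
        x[i] = (m[i][k] - sum(m[i][j] * x[j] for j in range(i + 1, k))) / m[i][i]
    return x

def mahalanobis_sq(rows):
    """Squared Mahalanobis distance of each row from the sample mean."""
    n, k = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(k)]
    centred = [[r[j] - mean[j] for j in range(k)] for r in rows]
    # Sample covariance matrix (divisor n - 1).
    cov = [[sum(c[a] * c[b] for c in centred) / (n - 1) for b in range(k)]
           for a in range(k)]
    return [sum(ci * zi for ci, zi in zip(c, solve(cov, c))) for c in centred]

def chisq_cdf(x, df):
    """Chi-square CDF via the series for the lower incomplete gamma function."""
    if x <= 0:
        return 0.0
    a, z = df / 2.0, x / 2.0
    term = total = 1.0 / a
    n = 1
    while term > total * 1e-15:
        term *= z / (a + n)
        total += term
        n += 1
    return total * math.exp(a * math.log(z) - z - math.lgamma(a))

def chisq_quantile(p, df):
    """Inverse chi-square CDF by bisection (0 < p < 1)."""
    lo, hi = 0.0, 1.0
    while chisq_cdf(hi, df) < p:
        hi *= 2.0
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if chisq_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def chisq_plot_points(rows):
    """Pairs (sorted squared distance, expected chi-square quantile)."""
    n, k = len(rows), len(rows[0])
    d2 = sorted(mahalanobis_sq(rows))
    return [(d, chisq_quantile((i - 0.5) / n, k))
            for i, d in enumerate(d2, start=1)]

# Hypothetical data: six cases on three variables (not the iris example).
rows = [[1.0, 2.0, 0.5], [2.0, 1.0, 1.5], [3.0, 4.0, 1.0],
        [4.0, 3.0, 3.0], [5.0, 6.0, 2.0], [6.0, 5.0, 4.5]]
for d, q in chisq_plot_points(rows):
    print(f"{d:8.4f} {q:8.4f}")
```

Unlike the macro, this sketch takes the degrees of freedom from the number of columns supplied rather than fixing them at 3. The two printed columns correspond to the macro's mahal and chisq variables for the selected group.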
Reference
Burdenski, T. (2000). Evaluating univariate, bivariate, and multivariate Normality using graphical and statistical procedures. Multiple Linear Regression Viewpoints, 26(2), 15-28.