FAQ/kw - CBU statistics Wiki

Post-hoc nonparametric pairwise comparisons between groups

As of Release 18, the NPTESTS procedure offers Dunn's post hoc tests for the Kruskal-Wallis omnibus test. In the menus, select Analyze>Nonparametric Tests>Independent Samples. If you specify a grouping variable with more than two levels and one or more test (response) variables whose measurement level is defined as Scale, you will get a Kruskal-Wallis test automatically, followed by post hoc tests if the omnibus test is significant. Field (2013) gives some examples of post hoc tests in SPSS.
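
The menu choices above correspond roughly to the syntax sketched here. It assumes the active dataset holds a response called y and a grouping variable called group (the names used in the example data further down this page), and it sets their measurement levels explicitly because NPTESTS relies on them; adapt the names to your own file.

```spss
* Assumes variables named y (response) and group (grouping factor).
variable level y (scale) /group (nominal).

nptests
  /independent test (y) group (group) kruskal_wallis(compare=pairwise)
  /missing scope=analysis usermissing=exclude
  /criteria alpha=0.05 cilevel=95.
```

The pairwise comparisons are shown in the Model Viewer output, and only when the omnibus test is significant.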

Up to version 17, however, SPSS has no procedure for performing post-hoc pairwise group comparisons following a Kruskal-Wallis nonparametric test. The Kruskal-Wallis test compares three or more groups and is the nonparametric analogue of the one-way ANOVA. Lantz (2013) recommends it for the analysis of non-Normal group data.

The syntax below and this EXCEL spreadsheet implement procedures for performing nonparametric post-hoc tests on independent samples suggested by Sprent and Smeeton (2001) and Conover (1999). To guard against false positive results, Sprent and Smeeton suggest that only specific comparisons should be tested, and only if the overall Kruskal-Wallis test is significant (as in the LSD approach for t-tests). The results from this method are very similar to those from the method of Conover (1999). Both methods are liberal in detecting pairwise group differences. The spreadsheet performs all possible pairwise post-hoc comparisons for up to 500 observations in total and up to 10 groups.
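
Reading off the compute statements in the syntax further down this page, the pairwise comparison appears to be a t-type statistic on mean ranks of the kind given by Conover (1999). The following is a sketch of that calculation rather than a quotation from either source. The symbols are labels chosen here: N is the total number of observations, k the number of groups, n_i the size of group i, r_ij the rank of observation j in group i (ranked over all groups together), R_i the rank sum of group i, and \bar{R}_i = R_i / n_i its mean rank.

```latex
\begin{aligned}
S^2 &= \frac{1}{N-1}\left(\sum_{i=1}^{k}\sum_{j=1}^{n_i} r_{ij}^2 - \frac{N(N+1)^2}{4}\right), \\
T &= \frac{1}{S^2}\left(\sum_{i=1}^{k}\frac{R_i^2}{n_i} - \frac{N(N+1)^2}{4}\right), \\
t_{ab} &= \frac{\bar{R}_a - \bar{R}_b}
          {\sqrt{S^2\,\dfrac{N-1-T}{N-k}\left(\dfrac{1}{n_a}+\dfrac{1}{n_b}\right)}}
\end{aligned}
```

Here T is the Kruskal-Wallis statistic and t_ab, for the two groups a and b being compared, is referred to a t distribution with N-k degrees of freedom.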

As a further precaution, Bonferroni, Sidak, Holm or Ryan adjustments could be applied to the p-values in the output. In particular, other authors suggest using these corrections on pairwise Mann-Whitney tests. For further details on these adjustments see the Graduate Statistics Seminar on post-hoc tests.
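
As a minimal sketch of the two simplest adjustments, the syntax below applies Bonferroni and Sidak corrections to three unadjusted p-values. The p-values are hypothetical and typed in purely for illustration, as are the names pair, p, k, pbonf and psidak. In practice the values would come from the pairwise procedure on this page or from pairwise Mann-Whitney tests.

```spss
* Hypothetical unadjusted p-values for three pairwise comparisons.
* NB data list replaces the active dataset, so save your own data first.
data list free / pair (a3) p.
begin data
1v2 0.012
1v3 0.030
2v3 0.450
end data.

* k is the number of comparisons being adjusted for.
compute k = 3.
* Bonferroni multiplies by k and caps the result at 1.
compute pbonf = min(1, k*p).
* Sidak uses 1 - (1 - p) to the power k.
compute psidak = 1 - (1 - p)**k.
formats p pbonf psidak (f6.4).
list.
```

Holm and Ryan adjustments depend on the rank order of the p-values and are not covered by this sketch.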

Sokal and Rohlf (1995) proposed a test of all pairwise post-hoc comparisons using a Tukey correction, which is consequently much more conservative than the Conover and the Sprent and Smeeton tests. The latter are better suited to subsets of pairwise comparisons.

* Example data set below

data list free/
 y group.

begin data
9 1
9 1
1 1
2 1
4 2
5 2
7  2
12 2
6 3
7 3
9 3
8 3
6 3
2 3
5 3
end data.
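
Because the pairwise procedure on this page is only appropriate after a significant omnibus result, it is worth running the overall Kruskal-Wallis test on the data first. A short sketch using the legacy NPAR TESTS command, assuming the groups are coded 1 to 3 as in the example data above:

```spss
* Overall Kruskal-Wallis test of y across groups coded 1 to 3.
npar tests
  /k-w=y by group(1 3).
```

The range in brackets gives the lowest and highest group codes to include.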

After opening a data file containing columns named y (response) and group, you can run the SPSS macro syntax below: [COPY AND PASTE INTO AN SPSS SYNTAX WINDOW, SELECT ALL AND RUN; SPECIFY THE TWO GROUPS TO BE COMPARED IN THE LAST LINE]

* performs Sprent & Smeeton (2001) method of 
* pairwise comparisons for kruskal-wallis test
* Data file containing just the two columns
* called y and group
* inputs are the two specified groups to be compared
* This test is only valid if the overall 
* Kruskal-Wallis test is statistically
* significant
* results are saved to output.sav which appears
* in My Documents folder and are outputted
* in SPSS output window
* The Y and Group columns are backed up in a
* file called input.sav located in the My
* Documents folder
* reference Sprent, P. and Smeeton, NC (2001)
* Applied Nonparametric Statistical Methods.
* Chapman and Hall:London.

* inputs groups to be compared 

set errors=none.
set mprint=off.

define !kwpairs ( !pos !tokens(1)
                       / !pos !tokens(1)).

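save outfile=input.sav /keep=y group.

* rank y over all groups (ties get the mean rank).
rank variables=y /rank into ry /print=no.
compute rysq=ry*ry.

* per-group rank sum, sum of squared ranks and group size.
aggregate outfile=*
 /break=group
 /sumy sumysq= sum(ry rysq)
 /nsize = n(ry).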

compute mrk = sumy/nsize.
compute suma = sumy*sumy/nsize.
compute con=1.

* input which two groups to be compared.

if (group=!1) m1=sumy/nsize.
if (group=!2) m2=sumy/nsize.
if (group=!1) nx=nsize.
if (group=!2) ny=nsize.

* pool over groups: con is constant so a single case results.
aggregate outfile=*
 /break=con
 /con2 ssq= sum(suma sumysq)
 /mn1 mn2 = first(m1 m2)
 /n1 n2 = first(nx ny)
 /ntot = sum(nsize)
 /ng = n(nsize).

compute #con1=ntot*(ntot+1)*(ntot+1)/4.
compute #con3= (ntot-1)*(con2-#con1).
compute #con4= #con3/(ssq-#con1).
compute #con5=(ntot-1-#con4)*(n1+n2).
compute con6=(ssq-#con1)*(#con5).
compute con7=(n1*n2)*(ntot-ng)*(ntot-1).

* Just use two-tailed uncorrected t
* Given overall K-Wallis is statistically
* significant ie LSD equivalent.

compute twot=2.

compute cd025= mn1-mn2 + idf.t(0.05/twot,ntot-ng)*sqrt(con6/con7).
compute cd975= mn1-mn2 - idf.t(0.05/twot,ntot-ng)*sqrt(con6/con7).
compute pv = twot*cdf.t(-abs(mn1-mn2) / sqrt(con6/con7),ntot-ng).

save outfile=output.sav /keep = mn1 mn2 cd025 cd975 pv.

get file=output.sav.
compute odiff=abs(mn1 - mn2).

formats all (f11.8).
variable labels mn1 'Mean Rank Group 1' /mn2 'Mean Rank Group 2' / odiff 'Observed difference' /pv ' Uncorrected Two-tailed P-Value'
         / cd025 '95% CI L ' / cd975 '95% CI U'.

report format=list automatic align(center)
  /variables=mn1 mn2 odiff pv cd025 cd975
  /title "Pairwise comparisons using method"
         "of Sprent and Smeeton (2001) for Kruskal-Wallis test"
         " 95% Confidence Interval for difference in mean ranks in (L,U)"
         " "
         " NB: This test is only valid if the Kruskal-Wallis test is statistically"
         " significant overall using all the groups".

!enddefine.



!kwpairs 2 3.
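
Each call to the macro leaves the results, not the original data, as the active dataset. To compare further pairs, one approach is to reload the backed-up data before each call. This sketch assumes input.sav was written to the current working directory by the first call.

```spss
* Reload the backed-up y and group columns before each further comparison.
get file='input.sav'.
!kwpairs 1 2.

get file='input.sav'.
!kwpairs 1 3.
```

Note that output.sav is overwritten by every call, so copy the results from the output window if you need to keep them.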


Conover WJ (1999) Practical nonparametric statistics. 3rd Edition. Wiley:New York.

Field A (2013) Discovering statistics using IBM SPSS Statistics. Fourth Edition. Sage:London.

Lantz B (2013) The impact of sample non-normality on ANOVA and alternative methods. British Journal of Mathematical and Statistical Psychology 66(2) 224-244. This paper recommends the use of Kruskal-Wallis tests over other tests as it is particularly sensitive to picking up group differences between non-Normal populations.

Sokal RR & Rohlf FJ (1995) Biometry:the principles and practice of statistics in biological research. 3rd Edition. WH Freeman:New York.

Sprent P & Smeeton NC (2001) Applied nonparametric statistical methods. 3rd Edition. Chapman and Hall:London.

None: FAQ/kw (last edited 2018-01-05 12:45:54 by PeterWatson)