Performing randomisation tests using nonparametric methods

Randomisation tests (see, for example, Edgington and Onghena (2007)) compare a test statistic computed from the observed data, such as a sum or a correlation, with the values of that statistic computed from all possible permutations of the data. This yields a one- or two-tailed probability (p-value), which is the probability under the null hypothesis of observing an effect at least as extreme as that observed in the data. When there are too many permutations to enumerate, the p-value may be approximated by drawing a large number of random permutations (e.g. 10000) of the data (also available using the Monte-Carlo option in certain SPSS nonparametric procedures). For further examples and R code for permutation tests applied to analysis of variance see here. The contents of this webpage are also reproduced here.
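
The Monte-Carlo approximation can be sketched in R as below. This is a minimal illustration, not the SPSS procedure: the seven scores and the group sizes of 2 and 5 are illustrative data, the names score, group, observed, perm_sums and p_approx are chosen here, the test statistic is assumed to be the rank sum of the smaller group, and the seed is fixed only for reproducibility.

```r
# Illustrative data (an assumption for this sketch): 7 scores, groups of size 2 and 5
score <- c(1, 2, 3, 4, 5, 6, 7)
group <- c(1, 1, 2, 2, 2, 2, 2)

# Test statistic: sum of the ranks in group 1
observed <- sum(rank(score)[group == 1])

set.seed(1)      # fixed seed so the sketch is reproducible
n_perm <- 10000  # number of random permutations to draw

# Shuffle the group labels and recompute the statistic each time
perm_sums <- replicate(n_perm, sum(rank(score)[sample(group) == 1]))

# One-tailed p-value: proportion of permutations with a rank sum
# at least as low as the one observed
p_approx <- mean(perm_sums <= observed)
p_approx
```

The estimate changes slightly with the seed and becomes more precise as n_perm increases.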

These tests have the advantage of not assuming a particular distribution for the data and can be performed with relatively small amounts of data.

Some of these tests can be performed using the exact option in nonparametric tests in SPSS; see e.g. Dugard, File and Todman (2012), who also recommend their use in single-case studies. In particular, they state that randomisation tests do not need to assume that observations are independent. They also say that 'large auto-correlations among neighboring observations will reduce the chance of detecting a treatment effect'. Further, Cannon, Warner, Taddei and Kleinbaum (2001) show that parameter variances in tests which ignore correlations in the data are biased upwards, giving inflated p-values compared to tests that correctly model these correlations.

An example is given below. Its result may be obtained using the exact option for 2 independent (unrelated) samples in SPSS.

Score		Group
1		1
2		1
3		2
4		2
5		2
6		2
7		2

We are interested in seeing if the two groups differ on their ranked scores. To do this the Mann-Whitney test uses the sum of the observation ranks in one of the groups and compares this sum to all possible sums that could result from two groups of sizes 2 and 5.
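
As a small sketch in R, the ranks and the rank sum for each group in the table above can be computed as follows. The names score, group, ranks and rank_sum_by_group are chosen here for illustration.

```r
# Data from the table above
score <- c(1, 2, 3, 4, 5, 6, 7)
group <- c(1, 1, 2, 2, 2, 2, 2)

# Rank all seven observations together, then sum the ranks within each group
ranks <- rank(score)
rank_sum_by_group <- tapply(ranks, group, sum)
rank_sum_by_group
```

The Mann-Whitney test works with the first of these sums, the one for the group of size 2.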

In the above example we have the most extreme scenario, where the two smallest observed scores are both in the two-observation group. There are 21 possible ways of choosing two ranks from 7 observations (assuming no ties) and none of these has a rank sum lower than that observed (1+2=3). The 21 possible pairs are given below.

Group Ranks (N=2)		Rank sum
1,2		3
1,3		4
1,4		5
1,5		6
1,6		7
1,7		8
2,3		5
2,4		6
2,5		7
2,6		8
2,7		9
3,4		7
3,5		8
3,6		9
3,7		10
4,5		9
4,6		10
4,7		11
5,6		11
5,7		12
6,7		13

So the one-tailed p-value, the probability of observing a rank sum at least as low as that in the data given two groups of sizes 2 and 5 (as observed in the data), is 1/21 = 0.048.
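
The enumeration can be reproduced in R with combn. This is a sketch assuming no ties, so that the ranks are simply 1 to 7. The names pairs, rank_sums, n_pairs and p_one_tailed are chosen here for illustration.

```r
# All ways of choosing 2 ranks out of 7: a 2 x 21 matrix, one column per pair
pairs <- combn(7, 2)

# Rank sum for each possible pair
rank_sums <- colSums(pairs)
n_pairs <- ncol(pairs)

# One-tailed p-value: proportion of pairs with a rank sum
# at least as low as the observed value of 3
p_one_tailed <- sum(rank_sums <= 3) / n_pairs
p_one_tailed
```

Calling table(rank_sums) lists how many pairs give each rank sum.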

We can get this result by putting the data into SPSS and choosing exact under analyze:nonparametric tests:exact. We expect a priori the group with two elements to have the lower values, hence the p-value is one-tailed. If we are not sure of the direction of the group difference we double the one-tailed p-value and get a two-tailed p-value of 2/21 = 0.095.

In R defining (unpaired) groups, x and y:

x <- c(1,2)
y <- c(3,4,5,6,7)

then running

wilcox.test(x, y, exact=TRUE, alternative="less")

gives the same result we computed "by hand" and obtained in SPSS.

        Wilcoxon rank sum test

data:  x and y 
W = 0, p-value = 0.04762
alternative hypothesis: true mu is less than 0
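
Two further checks may help, sketched here with the same x and y. R reports W as the rank sum of x minus its smallest possible value, n1(n1+1)/2, which is why W = 0 corresponds to the rank sum of 3 found by hand. The two-tailed p-value is obtained by changing the alternative argument. The names n1, rank_sum_x, w_by_hand and two_tailed are chosen here for illustration.

```r
x <- c(1, 2)
y <- c(3, 4, 5, 6, 7)

# Rank sum of x within the pooled sample, and the W statistic derived from it
n1 <- length(x)
rank_sum_x <- sum(rank(c(x, y))[seq_len(n1)])
w_by_hand <- rank_sum_x - n1 * (n1 + 1) / 2

# Exact two-tailed test on the same data
two_tailed <- wilcox.test(x, y, exact = TRUE, alternative = "two.sided")
two_tailed$p.value
```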

Note that Owora et al. (2022) comment that, by chance, randomisation can produce imbalances between randomly allocated groups, especially when the groups are small.

References

Cannon, MJ, Warner, L, Taddei, JA and Kleinbaum, DG (2001) What can go wrong when you assume that correlated data are independent: an illustration from the evaluation of a childhood health intervention in Brazil. Statistics in Medicine 20 1461-1467.

Dugard, P, File, P and Todman, J (2012) Single-case and small-n experimental designs: a practical guide to randomization tests. Second edition. Routledge: Hove.

Edgington, ES and Onghena, P (2007) Randomisation tests: fourth edition. CRC Press: London.

Owora, AH, Dawson, J, Gadbury, G, Mestre, L, Pavela, G, Mehta, T, Vorland, CJ, Xun, P and Allison, DB (2022) Randomisation can do many things - but it cannot "fail". Significance 19(1) 20-23.
