FAQ/pvs - CBU statistics Wiki

Adjusted p-values in SPSS and R

Howell DC (1992, 1997, 2002) describes various ways of adjusting uncorrected p-values when comparing all possible pairs of repeated measures group means (see here). In particular, Howell recommends and describes using the SPSS macros rmpost1.sps and rmpost2.sps, first written by David Nichols of SPSS, which use the Bonferroni correction to control Type I error when performing multiple t-tests between the groups in repeated measures.

This syntax can then be run using the lines below, substituting as many variable names as needed for the repeated measures levels, which are called 'Reading', 'memory', 'attentin' and 'speech' in this example.

include rmpostb.sps.
rmpost var=Reading memory attentin speech /alpha = .05.

An adapted form of an SPSS macro which additionally performs the Holm and Sidak variants is given below. The Holm-Bonferroni method in particular is recommended for multiple testing of several correlations from the same matrix by Larzelere and Mulaik (1977) and Howell (2002, pages 388-390). Other work (see here) suggests the Holm-Bonferroni method may be used for correlation matrices smaller than 15 by 15 in size. There are, however, problems with Bonferroni methods, so a Holm-Sidak approach is also available, outputted as downsidk in the macro below. Both the Holm-Bonferroni (recommended here) and Holm-Sidak step-down methods, as well as a method gaining popularity in imaging studies, the FDR method, may also be performed using a spreadsheet or with R. Klockars, Hancock and McAweeney (1995) show that Holm procedures, which use different (weighted) significance levels for the observed p-values, have greater power to detect a variety of post-hoc differences than the Bonferroni approach, which uses the same (unweighted) cut-off for significance for all the p-values. The p.adjust procedure in R adjusts a set of p-values using a variety of methods.
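To make the arithmetic of these corrections concrete, the sketch below (illustrative Python only, not the SPSS macro or R's p.adjust; the function name adjust is made up for this example) applies the one-step Bonferroni and Sidak corrections and their step-down (Holm) variants to the three example p-values used in the macro's test data, 0.266, 0.139 and 0.016.

```python
def adjust(pvals):
    """Return Bonferroni, Sidak, Holm and Holm-Sidak adjusted p-values,
    in the original order of `pvals`.  Illustrative sketch only."""
    n = len(pvals)
    order = sorted(range(n), key=lambda i: pvals[i])   # indices, smallest p first
    # One-step corrections use the full number of comparisons, n.
    bonf = [min(1.0, p * n) for p in pvals]
    sidak = [1 - (1 - p) ** n for p in pvals]
    # Step-down corrections work through the p-values from smallest to
    # largest, using n, n-1, ... comparisons at each step; a running
    # maximum keeps the adjusted values monotone (as R's p.adjust does).
    holm, hsid = [0.0] * n, [0.0] * n
    running_h = running_s = 0.0
    for rank, i in enumerate(order):                   # rank 0 = smallest p
        m = n - rank                                   # comparisons still "live"
        running_h = max(running_h, min(1.0, m * pvals[i]))
        running_s = max(running_s, 1 - (1 - pvals[i]) ** m)
        holm[i], hsid[i] = running_h, running_s
    return bonf, sidak, holm, hsid

bonf, sidak, holm, hsid = adjust([0.266, 0.139, 0.016])
print([round(p, 3) for p in holm])   # [0.278, 0.278, 0.048]
```

The Bonferroni and Holm columns agree with R's p.adjust(c(0.266, 0.139, 0.016), "bonferroni") and p.adjust(c(0.266, 0.139, 0.016), "holm"); p.adjust offers no Sidak option.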

Lix and Sajobi (2010) say that the above FDR approach (Benjamini and Hochberg, 1995) is more powerful than both the Bonferroni method and that of Hochberg (1988), particularly as the number of tests increases, and that it also controls the familywise error rate 'in a weak sense'. They recommend the Hochberg (1988) method for post-hoc tests in repeated measures since it has good power and, indirectly, therefore also the FDR, since it is more powerful still.
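Both the Benjamini-Hochberg (FDR) and Hochberg (1988) adjustments are step-up procedures: they work from the largest p-value downwards, taking a running minimum, and differ only in the multiplier applied at each rank. A minimal Python sketch (illustrative only; the helper name step_up is made up), again using the three example p-values:

```python
def step_up(pvals, mult):
    """Step-up adjustment: work from the largest p-value down, taking a
    running minimum of p * mult(rank, n), capped at 1.  Sketch only."""
    n = len(pvals)
    order = sorted(range(n), key=lambda i: pvals[i])   # indices, smallest p first
    adj = [0.0] * n
    running = 1.0
    for rank in range(n - 1, -1, -1):                  # largest p-value first
        i = order[rank]
        running = min(running, min(1.0, pvals[i] * mult(rank, n)))
        adj[i] = running
    return adj

pvals = [0.266, 0.139, 0.016]
bh = step_up(pvals, lambda r, n: n / (r + 1))    # Benjamini-Hochberg (FDR)
hoch = step_up(pvals, lambda r, n: n - r)        # Hochberg (1988)
print([round(p, 3) for p in hoch])   # [0.266, 0.266, 0.048]
```

The results agree with R's p.adjust methods "BH" and "hochberg".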

Keselman et al (2011, 2012) compare the FDR with a range of methods which control the generalized familywise error rate (termed the k-FWER), capping at 5% the chance of making more than k false rejections, and find the FDR and the Sarkar method to be more powerful. Note: Holm's method, fitted below, corresponds to the 1-FWER, allowing no more than one false rejection.

Note that Howell DC (1997, p.351) states that it is not necessary to adjust for post-hoc tests (or even to have an overall statistically significant F test) if you are interested in testing a specific comparison - "Current thinking and the logic behind most of our post-hoc tests, however, does not require overall significance before making specific comparisons".

There is also a multcompare procedure in MATLAB which compares all pairs of group means using the Tukey-Kramer test.


Benjamini Y & Hochberg Y (1995) Controlling the false discovery rate: a practical and powerful approach to multiple testing. Journal of the Royal Statistical Society, Series B 57 289-300.

Hochberg Y (1988) A sharper Bonferroni procedure for multiple tests of significance. Biometrika 75 800-802.

Howell DC (1997) Statistical methods for psychology. Fourth Edition. Wadsworth: Belmont, CA.

Howell DC (2002) Statistical methods for psychology. Fifth Edition. Wadsworth: Pacific Grove, CA.

Keselman HJ, Miller CW and Holland B (2011) Many tests of significance: new methods for controlling type I errors. Psychological Methods 16(4) 420-431. Note: Errata in the R codes listed in this paper for evaluating commonly used post-hoc tests are given in Keselman (2012).

Keselman HJ, Miller CW and Holland B (2012) Many tests of significance: new methods for controlling type I errors: correction to Keselman et al. (2011). Psychological Methods 17(4) 679.

Klockars AJ, Hancock GR and McAweeney MJ (1995) Power of unweighted and weighted versions of simultaneous and sequential multiple-comparison procedures. Psychological Bulletin 300-307.

Larzelere RE and Mulaik SA (1977) Single-sample tests for many correlations. Psychological Bulletin 84 557-569.

Lix LM and Sajobi T (2010) Testing multiple outcomes in repeated measures designs. Psychological Methods 15(3) 268-280.


* Enter a column of p-values and this macro will adjust
* for the number in the column. The Ryan and the Einot &
* Gabriel methods are for pairwise comparisons of group
* locations (e.g. means, mean ranks) with a step size of
* abs(j - i) + 1, where the higher of the two means has an
* overall rank of j and the lower an overall rank of i.

* SPSS uses REGWQ to compute this for pairwise comparisons
* of group means for between-subjects factors in univariate
* ANOVA. The macro could be applied to p-values from ANY
* procedure, e.g. nonparametric tests, as it uses only the
* p-values and the number of comparisons.

* Create a dataset with all the uncorrected p-values and
* step = abs(difference in ranks of group locations) + 1.
* Adjust the data input below as required.
* If interested ONLY in the Holm and Sidak methods,
* put step = 1 for all inputted p-values.

* The program creates a file called temp.sps in the
* My Documents folder which may be deleted after running
* the macro.

* -99 in the output for the Holm and Sidak procedures
* indicates that the pairwise comparison is not tested,
* and is deemed nonsignificant, because the previous
* comparison was nonsignificant (at alpha = 0.05 by
* default; this may be changed by editing the value of
* alp in the syntax below).

DATA LIST LIST / PVAL (f9.3) STEP (f2.0).
BEGIN DATA
0.266 2
0.139 3
0.016 2
END DATA.

SET ERRORS=NONE.
SET MPRINT=OFF.

* alp is the significance level used by the step-down (Holm
* and Sidak) procedures; change the value here to re-specify
* the significance level.
COMPUTE alp=0.05.

* Sort the p-values into ascending order (the step-down
* methods compare each p-value with the one before it).
SORT CASES BY PVAL (A).

* Calculate the rank position of each p-value (pos) and the
* number of p-values (N).
RANK VARIABLES=PVAL /RANK INTO pos /N INTO N.

* N contains the number of cases in the file.
* Make a submacro, !nbcases, to be invoked from the syntax.
WRITE OUTFILE 'C:\Documents and Settings\peterw\My Documents\temp.sps' /"DEFINE !nbcases()"n"!ENDDEFINE.".
EXECUTE.
INCLUDE FILE='C:\Documents and Settings\peterw\My Documents\temp.sps'.
/* The number of cases in the file is now accessible using !nbcases */.

COMPUTE bonferr=PVAL*!nbcases.
IF (bonferr > 1) bonferr=1.
COMPUTE sidak=1-(1-PVAL)**!nbcases.
COMPUTE holm=(!nbcases-pos+1)*PVAL.
IF (LAG(holm,1) > alp | LAG(holm,1) = -99) holm=-99.
COMPUTE downsidk=1-(1-PVAL)**(!nbcases-pos+1).
IF (LAG(downsidk,1) > alp | LAG(downsidk,1) = -99) downsidk=-99.
COMPUTE ryan=PVAL*!nbcases/STEP.
IF (ryan > 1) ryan=1.
COMPUTE eingab=1-(1-PVAL)**(!nbcases/STEP).
IF (eingab > 1) eingab=1.
FORMATS bonferr sidak holm downsidk ryan eingab (f7.3).
VARIABLE LABELS PVAL 'Original' /bonferr '1-step Bonferroni'
 /sidak '1-step Sidak' /holm 'Step-down Holm' /downsidk 'Step-down Sidak'
 /ryan 'Ryan' /eingab 'Einot & Gabriel' /STEP 'Step'.
SUMMARIZE
  /TABLES=PVAL bonferr sidak holm downsidk STEP ryan eingab
  /FORMAT=LIST NOCASENUM
  /TITLE="Original and adjusted p-values"
  /CELLS=NONE.
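As a cross-check on the macro's arithmetic, its Ryan and Einot & Gabriel formulas (p multiplied by n/step, capped at 1, and 1 - (1 - p)**(n/step)) can be reproduced directly from the example data. The Python fragment below is an illustrative sketch only, not part of the SPSS syntax.

```python
# Reproduce the macro's Ryan and Einot & Gabriel formulas for the
# example (p-value, step) pairs used in the SPSS syntax above.
data = [(0.266, 2), (0.139, 3), (0.016, 2)]
n = len(data)    # number of p-values in the column

ryan = [min(1.0, p * n / step) for p, step in data]
eingab = [min(1.0, 1 - (1 - p) ** (n / step)) for p, step in data]

print([round(p, 3) for p in ryan])   # [0.399, 0.139, 0.024]
```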

FAQ/pvs (last edited 2017-05-10 08:45:28 by PeterWatson)