### SPSS script to Perform the Benjamini and Hochberg FDR procedure

This procedure may also be computed using the spreadsheet here. Keselman HJ, Cribbie R and Holland B (1999) compared several pairwise comparison tests and found the FDR approach to be the most powerful when more than 6 pairwise tests were performed e.g. when comparing all possible pairs of means for five groups. Benjamini and Hochberg (1995) also show FDR has higher power compared to other procedures including a p-value adjustment procedure due to Hochberg (1988) which, in turn, is shown as having higher power than both Bonferroni and Holm methods for multiple testing in repeated measures by Lix and Sajobi (2010).

The FDR (Benjamini and Hochberg, 1995) is a stepdown procedure using a prescribed alpha usually 0.05 for testing **in advance of the p-value testing** (see Ernst Wit's discussion in Benjamini (2010)) and so the above spreadsheet should be used for hypothesis testing of multiple p-values. See here for a warning about using *exact* p-values instead of statements of form *p<alpha*.

The exact p-value which produces specific p-values of form *p=* rather than *p<alpha* may, however, be computed using this spreadsheet. The exact p-value could be computed to assess sensitivity to see how close an adjusted p-value is to statistical significance.

In addition to the code below see also the padjust function in R of form

p.adjust(p, method = p.adjust.methods, n = length(p))

for example

p <- c(0.01,0.04) p.adjust(p,method="fdr",n=2)

References

Benjamini Y (2010). Discovering the false discovery rate. *Journal of the Royal Statistical Society B* **72(4)** 405-416.

Benjamini Y and Hochberg Y (1995). Controlling the false discovery rate: a practical and powerful approach to multiple testing. *Journal of the Royal Statistical Society B* **57** 289-300.

Benjamini Y and Yekutieli D (2001). The control of the false discovery rate in multiple testing under dependency. *Ann. Statist.* **29** 1165-1188.

Hochberg Y (1988) A sharper Bonferroni procedure for multiple tests of significance. *Biometrika* **75** 800-803.

Keselman HJ, Cribbie R and Holland B (1999) The pairwise multiple comparison multiplicity problem: an alternative approach to familywise and comparisonwise type I error control. *Psychological Methods* **4(1)** 58-69.

Keselman HJ, Miller CW and Holland B (2011) Many tests of significance: new methods for controlling type I errors. *Psychological Methods* **16(4)** 420-431. (Looks at methods called k-FWER which control the probability of k or more false rejections to be 0.05 and contains R code for a variety of multiple comparisons procedures including FDR, k-FWER and 1-FWER (= Holm method which is computed here).

Lix LM and Sajobi T (2010) Testing multiple outcomes in repeated measures designs. *Psychological Methods* **15(3)** 268-280.

[COPY AND PASTE THE BELOW BOX SYNTAX INTO A SPSS SYNTAX WINDOW; SELECT ALL AND RUN. EDIT THE INPUT DATA AS REQUIRED]

DATA LIST FREE / pvals . BEGIN DATA 0.021 0.001 0.017 0.041 0.005 0.036 0.042 0.023 0.07 0.1 END DATA . * Calculate the number of p values. RANK PVALS /n into N. * N contains the number of cases in the file. * make a submacro to be invoked from the syntax. DO IF $CASENUM=1. WRITE OUTFILE 'C:\temp.sps' /"DEFINE !nbcases()"n"!ENDDEFINE.". END IF. EXE. COMPUTE alpha = 0.05 . SORT CASES BY pvals (A) . COMPUTE index = $CASENUM . EXECUTE . INCLUDE FILE='C:\temp.sps'. COMPUTE rank = index . COMPUTE ref = (alpha * rank) / !nbcases . COMPUTE diff = pvals - ref . COMPUTE comp = diff LT 0 . CREATE ccomp = CSUM(comp) . EXECUTE . MATRIX . GET rnk / VARIABLES = rank . GET pvals / VARIABLES = pvals . GET ccomp / VARIABLES = ccomp . GET alpha / VARIABLES = alpha . GET diff / VARIABLES = diff . COMPUTE ccmpmx = pvals LE (CCOMP>0)*CMAX(((CMAX(ccomp) &* (diff LE 0)) EQ ccomp) &* pvals) . SAVE {rnk, pvals, alpha, ccmpmx} / OUTFILE = * / VARIABLES = index,pvals,alpha,fdrsig . END MATRIX . SORT CASES BY index (A) . EXPORT OUTFILE = out . IMPORT FILE = out / KEEP = pvals alpha fdrsig . FORMAT pvals (F5.3) . FORMAT alpha (F5.3) . FORMAT fdrsig (F1.0) . VALUE LABELS fdrsig 1 'Sig' 0 'Nonsig' . SET RESULTS = LISTING . DO IF $CASENUM EQ 1 . PRINT EJECT /'P Value' 1 'FDR Criterion' 9 'FDR Significance' 24 . END IF . PRINT / pvals (T3 F5.3) alpha (T17, F5.3) fdrsig (T39, F1.0) . EXECUTE .

This produces the following output:

P Value FDR Criterion FDR Significance .001 .050 1 .005 .050 1 .017 .050 1 .021 .050 1 .023 .050 1 .036 .050 0 .041 .050 0 .042 .050 0 .070 .050 0 .100 .050 0

### R Code to Perform the Benjamini and Hochberg FDR procedure

[Insert P values in any order] pvalue <- c(0.021,0.001,0.017,0.041,0.005,0.036,0.042,0.023,0.07,0.1) sorted.pvalue<-sort(pvalue) sorted.pvalue j.alpha <- (1:10)*(.05/10) diff <- sorted.pvalue-j.alpha neg.diff <- diff[diff<0] pos.diff <- neg.diff[length(neg.diff)] index <- diff==pos.diff p.cutoff <-sorted.pvalue[index] p.cutoff p.sig <- pvalue[pvalue <= p.cutoff] p.sig

### R Output

> pvalue<-c(0.021,0.001,0.017,0.041,0.005,0.036,0.042,0.023,0.07,0.1) > sorted.pvalue <- sort(pvalue) > sorted.pvalue [sorted P values in ascending order] [1] 0.001 0.005 0.017 0.021 0.023 0.036 0.041 0.042 0.070 0.100 > j.alpha <- (1:10) * (0.05/10) > diff <- sorted.pvalue - j.alpha > neg.diff <- diff[diff < 0] > pos.diff <- neg.diff[length(neg.diff)] > index <- diff == pos.diff > p.cutoff <- sorted.pvalue[index] > p.cutoff [1] 0.023 > p.sig <- pvalue[pvalue <= p.cutoff] > p.sig [significant p values by B-H method]: [1] 0.021 0.001 0.017 0.005 0.023

A more elegant version of the above R code (by Adam Wagner) is given below which returns the maximum p-value which is below the threshold based on FDR.

# FDR implementation, adapted from Peter Watson's Stats wiki entry by Adam Wagner # http://imaging.mrc-cbu.cam.ac.uk/statswiki/FAQ/FDR # fdrPValue # Inputs # pValues.....the P-values that we wish to apply the FDR correction # multiple correction to; can be a vector or matrix # alpha.....the size of alpha we wish to use; default=0.05 # Outputs.....the largest P-value that is still significant with FDR applied -or- # halts execution if no P-values are significant fdrPValue <- function(pValues, alpha=0.05){ sortedPValues <- sort(pValues) # Error catching - stop if there are no p-values smaller than alpha if (sum(sortedPValues<alpha)==0) stop(paste("There no p-values smaller than alpha at", alpha, "- FDR pointless")) # Calculate the "prototypical" pvalues protoP <- 1:length(sortedPValues)*(alpha/length(sortedPValues)) diff = sortedPValues-protoP # FDR P value corresponds to the largest observed P-value which is smaller than or equal # to its prototype # Error checking to deal with the case when no P-values are still significant if (sum(diff<0)) fdrP <- sortedPValues[which(diff==max(diff[diff<0]))] else stop("No P-Values remain significant with the application of FDR") fdrP } testMatrix <- matrix(c(0.021,0.001,0.017,0.041,0.005,0.036,0.042,0.023,0.07,0.1), nrow=2) fdrPValue(testMatrix) fdrPValue(testMatrix, alpha=0.009) # NB: The P-values do *not* need to be in matrix form - # that is simple used here for testing