Combining p-values by Fisher's method and Stouffer's method
Combining p-values by Fisher's method
The basic idea is that if $$p_i$$ ($$i=1 \ldots n$$) are the one-sided $$p$$-values for $$n$$ independent statistics, then under the null hypothesis $$-2 \sum\log(p_i)$$ follows a $$\chi^2$$ distribution with $$2n$$ degrees of freedom. A large value of this statistic indicates that the combined $$p$$-values are smaller than would be expected if they were Uniform(0,1) variates.
The following MATLAB code evaluates the p-value of this statistic.
function p = pfast(p)
% Fisher's (1925) method for combination of independent p-values
% Code adapted from Bailey and Gribskov (1998)
product = prod(p);
n = length(p);
if n <= 0
    error('pfast was passed an empty array of p-values')
elseif n == 1
    p = product;
    return
elseif product == 0
    p = 0;
    return
else
    x = -log(product);
    t = product;
    p = product;
    for i = 1:n-1
        t = t * x / i;
        p = p + t;
    end
end
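The loop in pfast relies on a closed form for the upper tail of a $$\chi^2$$ distribution with an even number of degrees of freedom. As a sketch of the reasoning (the symbols $$x$$ and $$k$$ are introduced here for illustration and mirror the code's x and loop index i): under the null hypothesis each $$-2\log(p_i)$$ is $$\chi^2(2)$$, their sum is $$\chi^2(2n)$$, and writing $$x = -\log\prod_i p_i$$ gives

$$
P\left(\chi^2_{2n} > 2x\right) = e^{-x} \sum_{k=0}^{n-1} \frac{x^k}{k!} = \left(\prod_{i=1}^{n} p_i\right) \sum_{k=0}^{n-1} \frac{x^k}{k!}
$$

In the code, t holds the current term $$\left(\prod_i p_i\right) x^k / k!$$ and p accumulates the sum, so no call to a $$\chi^2$$ distribution function is needed.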
Let's try it out:
>> pvals = [0.1 0.01 0.01 0.7 0.3 0.1];
>> pfast(pvals)
ans =
    0.0021
That is, the combined p-value for this array of six $$p$$-values is 0.0021.
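This value can be checked by hand, assuming the tail sum $$\left(\prod_i p_i\right)\sum_{k=0}^{n-1} x^k/k!$$ with $$x = -\log\prod_i p_i$$ that the loop in pfast accumulates (all figures rounded):

$$
\begin{aligned}
\prod_i p_i &= 2.1 \times 10^{-7}, \qquad x = -\log(2.1 \times 10^{-7}) \approx 15.376 \\
\sum_{k=0}^{5} \frac{x^k}{k!} &\approx 1 + 15.4 + 118.2 + 605.9 + 2329.1 + 7162.4 \approx 10232 \\
p &\approx 2.1 \times 10^{-7} \times 10232 \approx 0.0021
\end{aligned}
$$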
Further investigations suggest that Fisher's method has inappropriate behaviour. [examples to be included]
Combining p-values by Stouffer's method
function pcomb = stouffer(p)
% Stouffer et al's (1949) unweighted method for combination of
% independent p-values via z's
if isempty(p)
    error('stouffer was passed an empty array of p-values')
else
    % sqrt(2)*erfinv(1-2*p) converts each p-value to a standard normal z
    pcomb = (1 - erf(sum(sqrt(2) * erfinv(1 - 2*p)) / sqrt(2*length(p)))) / 2;
end
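The one-line expression in stouffer can be read as the following steps. This is a sketch: $$\Phi$$ (the standard normal distribution function), $$z_i$$ and $$Z$$ are notation introduced here and do not appear in the code.

$$
z_i = \Phi^{-1}(1 - p_i) = \sqrt{2}\,\operatorname{erfinv}(1 - 2p_i), \qquad
Z = \frac{1}{\sqrt{n}} \sum_{i=1}^{n} z_i, \qquad
p_{\text{comb}} = 1 - \Phi(Z) = \frac{1}{2}\left(1 - \operatorname{erf}\left(\frac{Z}{\sqrt{2}}\right)\right)
$$

For the array [0.1 0.01 0.01 0.7 0.3 0.1] used in the Fisher example, the $$z_i$$ for 0.7 and 0.3 cancel, so a hand calculation gives $$Z \approx (2 \times 1.2816 + 2 \times 2.3263)/\sqrt{6} \approx 2.946$$ and a combined p-value of approximately 0.0016, slightly smaller than the 0.0021 from Fisher's method.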
Base R has no built-in erf or erfinv, so for use in R one needs to define
# erf and erfinv expressed through the normal distribution functions
erf <- function(x) 2 * pnorm(x * sqrt(2)) - 1
erfinv <- function(x) qnorm((x + 1) / 2) / sqrt(2)
pcomb <- function(p) {
  if (length(p) == 0) stop("pcomb was passed an empty vector of p-values")
  (1 - erf(sum(sqrt(2) * erfinv(1 - 2 * p)) / sqrt(2 * length(p)))) / 2
}
# p is assumed to be an existing numeric vector of p-values
res <- pcomb(p)
print(res)
A [attachment:combinedp.xls spreadsheet] will also compute Fisher's and Stouffer's combined p.
References
Bailey TL, Gribskov M (1998). Combining evidence using p-values: application to sequence homology searches. Bioinformatics, 14 (1) 48-54.
Fisher RA (1925). Statistical methods for research workers (13th edition). London: Oliver and Boyd.
Stouffer SA, Suchman EA, DeVinney LC, Star SA, Williams RM Jr (1949). Studies in Social Psychology in World War II: The American Soldier. Vol. 1, Adjustment During Army Life. Princeton: Princeton University Press.