Survival analysis sample size calculations

The total number of events needed to compare hazard rates (per unit time) may be evaluated using this spreadsheet, which uses a simple formula taken from Schoenfeld (1983), Hsieh and Lavori (2000) and Collett (2003, Chapter 10). The formula corresponds to testing a group regression estimate (ratio of hazards) in a Cox regression model.

In particular from Schoenfeld (1983) the total number of events, d, required is

d = $$\frac{\left[ z(0.5a) + z(b) \right]^2}{p(1-p)\left[\log(hr)\right]^2}$$

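for a two-sided type I error a, power 1-b, hazard ratio hr, p the proportion of subjects allocated to one of the two groups (so p(1-p) = 0.25 for equal allocation) and z(x) the upper x point of the Standard Normal distribution (the probit function evaluated at 1-x).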
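
As a rough illustration, the formula above can be evaluated directly in R. This is a minimal sketch, not code from Schoenfeld (1983): the function name schoenfeld_events and the example inputs are hypothetical, and p is read here as the proportion of subjects in one of the two groups, matching the role of pA in the R code further down this page.

```r
# Sketch of the events formula: d = [z(0.5a) + z(b)]^2 / (p(1-p)[log(hr)]^2).
# Assumption: p is the proportion of subjects in one of the two groups.
schoenfeld_events <- function(hr, p = 0.5, alpha = 0.05, beta = 0.20) {
  z_alpha <- qnorm(1 - alpha / 2)  # upper alpha/2 point, z(0.5a)
  z_beta  <- qnorm(1 - beta)       # upper beta point, z(b)
  (z_alpha + z_beta)^2 / (p * (1 - p) * log(hr)^2)
}

# Hypothetical inputs: hazard ratio 2, equal allocation, 5% two-sided, 80% power
d <- schoenfeld_events(hr = 2)
ceiling(d)  # 66
```

Because the formula depends on hr only through the squared log hazard ratio, hr = 2 and hr = 0.5 require the same number of events.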

Hsieh and Lavori (2000) further give sample size formulae for the number of deaths using continuous covariates in the Cox regression.

dc = $$\frac{\left[ z(0.5a) + z(b) \right]^2}{\sigma^2 \left[\log(hr)\right]^2}$$

with $$\sigma^2$$ equal to the variance of the covariate.

dc2 = $$\frac{dc}{1-R^2}$$ where $$R^2$$ is the squared multiple correlation coefficient from regressing the covariate of interest on the other covariates; this adjustment applies in the case of more than one continuous covariate. This method is computed using this spreadsheet. For a continuous covariate the hazard ratio could, for example, compare the rate at one sd above the mean with the rate at the mean.
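
The two Hsieh and Lavori quantities can be sketched in R as follows. This is an illustrative sketch only: the function name hsieh_lavori_events and the example inputs are hypothetical, and dc is read as the squared sum of Normal deviates divided by the covariate variance times the squared log hazard ratio.

```r
# Sketch of the continuous-covariate formulae:
#   dc  = [z(0.5a) + z(b)]^2 / (sigma2 * [log(hr)]^2)
#   dc2 = dc / (1 - R2)
# Assumption: hr is the hazard ratio per one-unit increase in the covariate.
hsieh_lavori_events <- function(hr, sigma2, R2 = 0, alpha = 0.05, beta = 0.20) {
  z_sum <- qnorm(1 - alpha / 2) + qnorm(1 - beta)
  dc  <- z_sum^2 / (sigma2 * log(hr)^2)  # single covariate
  dc2 <- dc / (1 - R2)                   # inflated for correlation with other covariates
  c(dc = dc, dc2 = dc2)
}

# Hypothetical inputs: hr = 1.5 per unit, covariate variance 1, R^2 = 0.2
res <- hsieh_lavori_events(hr = 1.5, sigma2 = 1, R2 = 0.2)
ceiling(res)  # dc = 48, dc2 = 60
```

A binary group indicator with proportion p in one group has variance p(1-p), so setting sigma2 = p(1-p) in this sketch gives the same value as the two-group formula for d.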

This approach is similar to Hsieh's approaches to sample size calculations for the odds ratio in a logistic regression (see here). This method may also be computed using the powerEpiCont function in the R package powerSurvEpi, as illustrated here.

Alternatively, the effect size can be expressed in terms of ratios of group survival rates, as used by the power calculators given here (which has R code, given below) and here (which uses results from Machin et al. (1997, 2009)). The free downloadable software WINPEPI also computes sample size and power for comparing survival curves. Both WINPEPI and Machin et al. give the same, or at least very similar, answers to those from Schoenfeld (1983) using the above formula together with a correction factor converting the number of deaths to the number of subjects required (see here).

The R code mentioned above, for the Machin et al. example, is taken from here. In it, hr is the hazard ratio to be detected, hr0 is the reference hazard ratio assumed under the null hypothesis (1 for no difference between groups), pE is the overall probability of the event occurring within the study period, pA is the proportion of the sample allotted to group 'A', alpha is the type I error and beta is 1-power.

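hr=2        # hazard ratio to detect
hr0=1       # reference (null) hazard ratio
pE=0.8      # overall probability of the event during the study
pA=0.5      # proportion of the sample allotted to group 'A'
alpha=0.05  # two-sided type I error
beta=0.20   # 1 - power
# total sample size (the outer parentheses print the result)
(n=((qnorm(1-alpha/2)+qnorm(1-beta))/(log(hr)-log(hr0)))^2/(pA*(1-pA)*pE))
ceiling(n) # 82
# power achieved with n subjects
(Power=pnorm((log(hr)-log(hr0))*sqrt(n*pA*(1-pA)*pE)-qnorm(1-alpha/2)))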

For the example in Collett (2003, Chapter 10) we have hr=0.5729, pE=0.495, pA=0.5, alpha=0.05 and beta=0.1, which gives a total sample size of n=274 (after rounding up) using the R code above. This agrees with the sample size worked out by Collett in his example.
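
Both worked examples can be reproduced by wrapping the same calculation in a function. This is a sketch under stated assumptions rather than code from any of the cited sources: the function name cox_sample_size is hypothetical, and the number of events is taken to be the number of subjects multiplied by pE, which is the simplest form of the deaths-to-subjects conversion.

```r
# Sketch: total subjects and events for a two-group Cox model comparison.
# Assumption: expected events = (unrounded n) * pE.
cox_sample_size <- function(hr, pE, pA = 0.5, alpha = 0.05, beta = 0.20, hr0 = 1) {
  z_sum <- qnorm(1 - alpha / 2) + qnorm(1 - beta)
  n <- (z_sum / (log(hr) - log(hr0)))^2 / (pA * (1 - pA) * pE)
  c(n = ceiling(n), events = ceiling(n * pE))
}

cox_sample_size(hr = 2, pE = 0.8)                      # n = 82
cox_sample_size(hr = 0.5729, pE = 0.495, beta = 0.10)  # n = 274
```

Dividing the required number of events by pE is what turns the events formula into a number of subjects, so the events element here corresponds to d in the Schoenfeld formula with p = pA.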

References

Collett D (2003) Modelling Survival Data in Medical Research. Second Edition. Chapman and Hall: London.

Hsieh FY and Lavori PW (2000) Sample size calculations for the Cox proportional hazards regression models with nonbinary covariates. Controlled Clinical Trials 21 552-560. A downloaded pdf of this paper is here.

Machin D, Campbell M, Fayers P, Pinol A (1997) Sample Size Tables for Clinical Studies. Second Ed. Blackwell Science ISBN 0-86542-870-0 p. 176-177.

Machin D, Campbell MJ, Tan SB, Tan SH (2009) Sample size tables for clinical studies. 3rd ed. Chichester: Wiley-Blackwell.

Schoenfeld DA (1983) Sample size formulae for the proportional hazards regression model. Biometrics 39 499-503.