A single predictor in a multiple binary logistic regression

A power calculator for up to two binary covariates, based on Demidenko (2007, 2008), is given here. It was previously also available here; that page may reappear pending a redesign of the http://biostat.hitchcock.org/ website (as of April 2017). An example of using Demidenko's programme is here.

Hsieh FY (1989) gives formulae and tables, for odds ratios between 0.6 and 3.0 and a ONE-tailed type I error, to compute the total sample size for a given power in a multiple binary logistic regression with continuous covariates. The odds ratio is that associated with a change of 1 sd in the value of the covariate. These calculations can be done using a spreadsheet.

Hsieh, Bloch and Larsen (1998) showed that sample size for multiple logistic regression predictors could be approximated using t-tests for a single predictor. A spreadsheet to compute sample size using their approach for a binary covariate is given here and for a continuous covariate is here. The odds ratio in the latter is that associated with an increase of one standard deviation in the covariate.
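
For readers without the spreadsheet to hand, the continuous-covariate calculation can be sketched in Python. This is a minimal sketch and not the spreadsheet itself. It assumes the commonly quoted form of the Hsieh, Bloch and Larsen (1998) formula, n = (z_alpha + z_beta)^2 / (P1 (1 - P1) b^2), where P1 is the proportion of events at the mean of the covariate and b is the log odds ratio per one standard deviation. It also assumes that the result is inflated by 1 / (1 - rho^2) to allow for the other covariates in the model, where rho^2 is the squared multiple correlation of the predictor of interest with those covariates. The function and argument names are illustrative only; check the results against the paper or the spreadsheet before relying on them.

```python
import math
from statistics import NormalDist


def hsieh_continuous_n(p1, odds_ratio_per_sd, alpha=0.05, power=0.80,
                       rho_squared=0.0, two_sided=True):
    """Approximate total sample size for one continuous predictor.

    p1                -- assumed event proportion at the mean of the covariate
    odds_ratio_per_sd -- odds ratio for a 1 sd increase in the covariate
    rho_squared       -- assumed squared multiple correlation between this
                         predictor and the other covariates (0 = none)
    """
    if not 0.0 < p1 < 1.0:
        raise ValueError("p1 must lie strictly between 0 and 1")
    if odds_ratio_per_sd <= 0.0 or odds_ratio_per_sd == 1.0:
        raise ValueError("odds ratio must be positive and different from 1")
    if not 0.0 <= rho_squared < 1.0:
        raise ValueError("rho_squared must lie in [0, 1)")

    z = NormalDist().inv_cdf  # standard normal quantile function
    z_alpha = z(1.0 - alpha / 2.0) if two_sided else z(1.0 - alpha)
    z_beta = z(power)

    log_or = math.log(odds_ratio_per_sd)  # effect size on the log odds scale
    n_single = (z_alpha + z_beta) ** 2 / (p1 * (1.0 - p1) * log_or ** 2)

    # Inflate for correlation with the other covariates, then round up.
    return math.ceil(n_single / (1.0 - rho_squared))


if __name__ == "__main__":
    print(hsieh_continuous_n(0.5, 1.5))                   # single predictor
    print(hsieh_continuous_n(0.5, 1.5, rho_squared=0.2))  # with other covariates
```

Setting two_sided=False gives the one-tailed version, which is the convention used in the Hsieh (1989) tables.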

(The following is reproduced from this website). A less complex approach is based upon the work of Peduzzi et al. (1996) who offer the following guideline for a minimum number of cases to include in your study.

Let p be the smaller of the proportions of negative and positive cases in the population and k the number of covariates (the number of independent variables). The minimum number of cases to include is then:

N = 10 k / p

For example: you have 3 covariates to include in the model and the proportion of positive cases in the population is 0.20 (20%). The minimum number of cases required is

N = 10 x 3 / 0.20 = 150

If the resulting number is less than 100 you should increase it to 100 as suggested by Long (1997).
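
The guideline above, including the floor of 100, can be written as a short Python function. This is a minimal sketch; the function name and arguments are illustrative and do not come from the cited papers.

```python
import math


def peduzzi_min_cases(k, p_positive, floor=100):
    """Minimum number of cases N = 10 k / p, raised to `floor` if smaller.

    k          -- number of covariates in the model
    p_positive -- proportion of positive cases in the population
    floor      -- lower limit on N (100, following Long, 1997)
    """
    # p is the smaller of the proportions of positive and negative cases.
    p = min(p_positive, 1.0 - p_positive)
    if k < 1 or not 0.0 < p < 1.0:
        raise ValueError("need k >= 1 and 0 < p_positive < 1")

    n = 10.0 * k / p
    # Round away floating-point noise before rounding up to a whole case.
    n = math.ceil(round(n, 6))
    return max(n, floor)


if __name__ == "__main__":
    print(peduzzi_min_cases(3, 0.20))  # the worked example above
    print(peduzzi_min_cases(1, 0.50))  # 10 x 1 / 0.5 = 20, raised to 100
```

Because p is taken as the smaller of the two proportions, supplying 0.80 for the proportion of positive cases gives the same answer as supplying 0.20.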

Whitaker HJ, Farrington CP, Spiessens B and Musonda P (2005) give simple sample size formulae, in the context of the self-controlled case series method, which take the length of the risk period into account, e.g. the fraction of the observation period during which you are exposed to the risk of an adverse event (see also Musonda et al. (2005) here).

References

Demidenko E (2007) Sample size determination for logistic regression revisited. Statistics in Medicine 26 3385-3397.

Demidenko E (2008) Sample size and optimal design for logistic regression with binary interaction. Statistics in Medicine 27 36-46.

Hsieh FY (1989) Sample size tables for logistic regression. Statistics in Medicine 8 795-802.

Hsieh, FY, Bloch, DA, and Larsen, MD (1998). A Simple Method of Sample Size Calculation for Linear and Logistic Regression. Statistics in Medicine 17 1623-1634. (Pdf file taken from here).

Long, JS (1997). Regression Models for categorical and limited dependent variables. Thousand Oaks, CA: Sage Publications.

Musonda P, Farrington CP and Whitaker HJ (2005). Sample sizes for self-controlled case series studies. Research report: The Open University.

Peduzzi, P, Concato, J, Kemper, E, Holford, TR, Feinstein, AR (1996). A simulation study of the number of events per variable in logistic regression analysis. Journal of Clinical Epidemiology 49 1373-1379.

Whitaker HJ, Farrington, CP, Spiessens B and Musonda P (2005). Tutorial in biostatistics: the self-controlled case series method. Statistics in Medicine 24 4035-4044.
