
Handling overdispersion in analysis of count data

Agresti (1996) notes that the standard errors in a logistic regression (0/1 data) or Poisson regression (counts) will be underestimated when the data vary more than the model expects. This happens because the binomial and Poisson distributions, unlike the normal distribution for example, have no separate parameter for the variance of the response: the variance is fixed by the mean.

Overdispersion can be detected by comparing the model fit (deviance) to its degrees of freedom (df). If there is no overdispersion, the deviance and its df will be approximately equal. If the deviance is substantially larger than its df, we have overdispersion in the counts. In this case we can estimate a scaling parameter equal to sqrt(deviance/df). The standard errors of the individual regression estimates then need to be multiplied by this scaling factor, and the Wald chi-squares for the overall effects need to be divided by the square of the scaling factor (that is, by deviance/df).
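The rescaling described above can be sketched in a few lines of Python. This is only an illustration: the function names and all numbers (deviance, df, estimate and standard error) are hypothetical and do not come from any real analysis.

```python
import math


def overdispersion_scale(deviance, df):
    """Scaling parameter sqrt(deviance / df) for an overdispersed model."""
    if df <= 0:
        raise ValueError("degrees of freedom must be positive")
    return math.sqrt(deviance / df)


def rescale(estimate, std_error, scale):
    """Return the adjusted standard error and the adjusted Wald chi-square."""
    adj_se = std_error * scale          # standard error is multiplied by the scale
    adj_wald = (estimate / adj_se) ** 2  # equals the naive Wald divided by scale**2
    return adj_se, adj_wald


# Hypothetical model output, chosen only to make the arithmetic easy to follow.
deviance = 180.0
df = 45
estimate, std_error = 0.6, 0.1

scale = overdispersion_scale(deviance, df)  # sqrt(180 / 45) = 2.0
naive_wald = (estimate / std_error) ** 2
adj_se, adj_wald = rescale(estimate, std_error, scale)

print(f"scale = {scale:.2f}")
print(f"standard error: {std_error:.3f} -> {adj_se:.3f}")
print(f"Wald chi-square: {naive_wald:.2f} -> {adj_wald:.2f}")
```

Here the deviance is four times its df, so the standard error doubles and the Wald chi-square falls to a quarter of its unadjusted value.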

This rescaling can be performed automatically by SPSS using the generalized linear models procedure, which has a scale parameter option available on the Estimation tab. This will give correct parameter standard errors and Wald chi-square values. Underdispersion is also possible: it is indicated by a deviance that is substantially smaller than its degrees of freedom.

Alternatively, generalized linear models assuming a negative binomial response may be fitted to account for overdispersion (e.g. using GENLIN in SPSS).
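To see why a negative binomial response can absorb overdispersion, it may help to compare variance functions. The sketch below assumes the common parameterisation in which the negative binomial variance is mu + alpha * mu^2; the name alpha and the value 0.5 are illustrative assumptions, not estimates from any data set.

```python
def poisson_variance(mu):
    """Poisson: the variance is fixed at the mean."""
    return mu


def negbin_variance(mu, alpha):
    """Negative binomial (assumed form mu + alpha * mu**2); alpha = 0 gives Poisson."""
    return mu + alpha * mu ** 2


# Illustrative means and a hypothetical dispersion value.
for mu in (1.0, 5.0, 20.0):
    print(mu, poisson_variance(mu), negbin_variance(mu, alpha=0.5))
```

With alpha greater than zero, the variance exceeds the mean by an amount that grows with the mean, which is the extra variation a Poisson model cannot represent.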

Reference

Agresti A (1996) An introduction to categorical data analysis. Wiley: New York
