Count Data Outcome

Is your count data overdispersed? When the variance is much larger than the mean, standard Poisson regression underestimates uncertainty. Choose the right count model for your data.

Jan 29 · 2 min read

Is the variance of your count data much larger than the mean (overdispersion)?

No / Not sure (variance is close to the mean)

- OR -

Yes (variance is substantially larger than the mean)


More Information (if you need help deciding)

No / Not sure: If the mean and variance of your count outcome are roughly equal, use Poisson Regression. This is the standard model for counts of events that occur independently at a constant rate, and it assumes that the variance equals the mean for any given set of predictor values. If you are unsure about overdispersion, start with Poisson and compare the residual deviance to the residual degrees of freedom.
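As a quick first look before fitting any model, the sample variance can be compared with the sample mean. The sketch below uses only the Python standard library. The function name `variance_to_mean_ratio` and the data are made up for illustration. This raw comparison ignores predictors, so treat it as a rough screen rather than a substitute for checking a fitted model.

```python
from statistics import mean, variance


def variance_to_mean_ratio(counts):
    """Return the sample variance divided by the sample mean."""
    m = mean(counts)
    if m == 0:
        raise ValueError("mean is zero; the ratio is undefined")
    # statistics.variance uses the n - 1 denominator (sample variance)
    return variance(counts) / m


# Hypothetical counts, for illustration only
counts = [0, 1, 1, 2, 2, 2, 3, 3, 4, 5]
ratio = variance_to_mean_ratio(counts)
print(f"mean={mean(counts):.2f}, variance={variance(counts):.2f}, ratio={ratio:.2f}")
```

A ratio near 1 is consistent with the Poisson assumption. A ratio well above 1 suggests overdispersion that is worth checking on the fitted model.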

Yes (overdispersion): If the variance is substantially larger than the mean, use Negative Binomial Regression. Overdispersion is common in practice and is often caused by unobserved heterogeneity, clustering, or excess zeros. A Poisson model fitted to overdispersed data produces standard errors that are too small, which in turn gives p-values that are too small and confidence intervals that are too narrow. The Negative Binomial model adds a dispersion parameter that allows the variance to exceed the mean.
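To make the role of the dispersion parameter concrete, the two variance assumptions can be written side by side. The form below is the common quadratic ("NB2") parameterization. The page does not say which parameterization it has in mind, so the symbol $\alpha$ and this particular form are assumptions made for illustration.

```latex
% Poisson: the variance is tied to the mean
\operatorname{Var}(Y \mid x) = \mu

% Negative Binomial (NB2 form, assumed here): extra variance grows with the mean
\operatorname{Var}(Y \mid x) = \mu + \alpha \mu^{2}, \qquad \alpha \ge 0
```

As $\alpha$ approaches 0 the Negative Binomial variance reduces to the Poisson variance, which is why the Poisson model can be viewed as a limiting case.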

How to check: Fit a Poisson model first. If the residual deviance divided by the residual degrees of freedom is substantially greater than 1 (a common rule of thumb is > 1.5), overdispersion is likely present. You can also run a formal overdispersion test or a likelihood ratio test comparing the Poisson and Negative Binomial fits.
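The deviance check can be sketched in plain Python for the simplest case, a Poisson model with an intercept and no predictors. In that case the fitted mean is the sample mean, so no iterative fitting is needed. The function name `poisson_dispersion_ratio` and the data are hypothetical. With predictors you would take the residual deviance and residual degrees of freedom from your statistics package's fitted model instead.

```python
import math


def poisson_dispersion_ratio(counts):
    """Residual deviance / residual df for an intercept-only Poisson model."""
    n = len(counts)
    if n < 2:
        raise ValueError("need at least two observations")
    mu = sum(counts) / n  # fitted mean of the intercept-only model
    deviance = 0.0
    for y in counts:
        # The term y * log(y / mu) is defined as 0 when y == 0
        log_term = y * math.log(y / mu) if y > 0 else 0.0
        deviance += 2.0 * (log_term - (y - mu))
    df_resid = n - 1  # one estimated parameter (the intercept)
    return deviance / df_resid


# Hypothetical counts with a few large values, for illustration only
counts = [0, 0, 1, 1, 2, 3, 5, 8, 13, 27]
ratio = poisson_dispersion_ratio(counts)
print(f"residual deviance / df = {ratio:.2f}")
if ratio > 1.5:
    print("Above the 1.5 rule of thumb: consider Negative Binomial regression.")
```

The 1.5 cutoff is the rule of thumb quoted above, not a formal test, so borderline values are better settled with one of the formal tests mentioned.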
