Negative Binomial Regression
Negative Binomial Regression models overdispersed count data where the variance exceeds the mean. Use it when Poisson regression is too restrictive for your event counts.

Quick Hits
- Models count outcomes when the variance is larger than the mean (overdispersion)
- Adds a dispersion parameter to the Poisson model to handle extra variability
- Coefficients exponentiate to incidence rate ratios, just like Poisson regression
- Produces more conservative (wider) confidence intervals than Poisson when data is overdispersed
- Default choice when Poisson residual deviance is much larger than degrees of freedom
The StatsTest Flow: Relationship or Prediction >> Prediction >> Count data outcome >> Overdispersion present
What is Negative Binomial Regression?
Negative Binomial Regression is a generalized linear model for count data that relaxes the Poisson assumption that the mean equals the variance. It adds a dispersion parameter that allows the variance to exceed the mean, making it appropriate for overdispersed count data.
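In symbols, the model described above can be sketched as follows. This uses the common NB2 parameterization; the symbols (\mu_i for the expected count, \alpha for the dispersion parameter, \beta for the coefficients) are notation chosen for this illustration and may differ from what a given software package reports.

```latex
\begin{aligned}
\log \mu_i &= \beta_0 + \beta_1 x_{i1} + \dots + \beta_p x_{ip}, \\
\mathrm{E}[Y_i \mid x_i] &= \mu_i, \\
\mathrm{Var}[Y_i \mid x_i] &= \mu_i + \alpha \mu_i^2, \qquad \alpha \ge 0.
\end{aligned}
```

As \alpha approaches 0, the variance shrinks to \mu_i and the model reduces to Poisson regression.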
Like Poisson regression, it uses a log link function and produces coefficients that can be exponentiated to incidence rate ratios (IRR). The key difference is that it properly accounts for extra-Poisson variability, producing more accurate standard errors, p-values, and confidence intervals.
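As a minimal sketch of that exponentiation step, the helper below converts a log-scale coefficient and its standard error into an IRR with an approximate 95% Wald interval. The function name and the example numbers are hypothetical and chosen only for illustration.

```python
import math


def incidence_rate_ratio(coef, std_err, z=1.96):
    """Return (IRR, lower, upper) for a log-link count-model coefficient."""
    irr = math.exp(coef)
    # The interval is built on the log scale, then exponentiated.
    lower = math.exp(coef - z * std_err)
    upper = math.exp(coef + z * std_err)
    return irr, lower, upper


# Hypothetical coefficient and standard error, not taken from a fitted model.
irr, lower, upper = incidence_rate_ratio(coef=0.40, std_err=0.10)
print(f"IRR = {irr:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```

Because the interval is computed on the log scale, it is not symmetric around the IRR once exponentiated.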
Negative Binomial Regression is also called the NB2 Model, Negative Binomial GLM, or Overdispersed Count Model.
Assumptions for Negative Binomial Regression
The assumptions for Negative Binomial Regression include:
- Count Outcome
- Overdispersion (or at minimum, no underdispersion)
- Independence
- Log-Linear Relationship
- Negative Binomial Distribution
Count Outcome
The dependent variable must be a non-negative integer count. This is the same requirement as Poisson regression.
Overdispersion
The model is designed for data where the variance of the counts exceeds their mean, even after accounting for the predictors. If the variance approximately equals the mean, Poisson regression is more efficient. If the variance is less than the mean (underdispersion), neither model is ideal and you may need a generalized Poisson model.
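One rough way to check for overdispersion is the Pearson dispersion statistic from a Poisson fit: values near 1 are consistent with Poisson, and values well above 1 suggest overdispersion. The sketch below assumes you already have fitted Poisson means. The counts are made up, and an intercept-only fit (whose fitted mean is simply the sample mean) stands in for a real model.

```python
def pearson_dispersion(counts, fitted_means, n_params):
    """Pearson chi-square divided by the residual degrees of freedom."""
    chi2 = sum((y - mu) ** 2 / mu for y, mu in zip(counts, fitted_means))
    return chi2 / (len(counts) - n_params)


# Hypothetical counts, for illustration only.
counts = [0, 0, 1, 0, 2, 0, 9, 1, 0, 14, 0, 3]

# An intercept-only Poisson model predicts the sample mean for every observation.
mean = sum(counts) / len(counts)
ratio = pearson_dispersion(counts, [mean] * len(counts), n_params=1)
print(f"dispersion statistic = {ratio:.2f}")
```

With a real model, `fitted_means` would be the predicted counts from the Poisson regression and `n_params` the number of estimated coefficients.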
Independence
Observations must be independent. Clustered or repeated-measures count data needs mixed-effects or GEE extensions.
Log-Linear Relationship
The log of the expected count should be approximately linear in the predictors.
Negative Binomial Distribution
The model assumes the counts follow a Negative Binomial distribution, which is a Poisson-Gamma mixture. This is reasonable when overdispersion arises from unobserved heterogeneity across subjects.
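The mixture idea can be illustrated with a small simulation: each subject gets its own Poisson rate drawn from a Gamma distribution, and the resulting counts are more variable than a Poisson with the same mean. This is a sketch using only the Python standard library. The function names, the mean of 2.0, and the dispersion value of 1.5 are assumptions made for the example.

```python
import math
import random


def poisson_draw(rate, rng):
    """Draw one Poisson variate with Knuth's method (adequate for small rates)."""
    threshold = math.exp(-rate)
    k, product = 0, rng.random()
    while product > threshold:
        k += 1
        product *= rng.random()
    return k


def negative_binomial_draw(mean, alpha, rng):
    """Poisson-Gamma mixture: the rate is Gamma(shape=1/alpha, scale=mean*alpha)."""
    rate = rng.gammavariate(1.0 / alpha, mean * alpha)
    return poisson_draw(rate, rng)


rng = random.Random(0)  # fixed seed so the run is repeatable
draws = [negative_binomial_draw(mean=2.0, alpha=1.5, rng=rng) for _ in range(20000)]

sample_mean = sum(draws) / len(draws)
sample_var = sum((d - sample_mean) ** 2 for d in draws) / (len(draws) - 1)
print(f"mean = {sample_mean:.2f}, variance = {sample_var:.2f}")
```

Under these settings the theoretical variance is mean + alpha * mean^2 = 2 + 1.5 * 4 = 8, four times what a Poisson model with the same mean would imply.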
When to use Negative Binomial Regression?
You should use Negative Binomial Regression when all of the following apply:
- Your outcome is a count of events
- The variance is larger than the mean (overdispersion)
- You want to model which factors affect the event rate
- Observations are independent
If the variance approximately equals the mean, use Poisson Regression for more efficient estimates. If your outcome is continuous, use Linear Regression. If it is binary, use Logistic Regression.
Negative Binomial Regression Example
Outcome: Number of app crashes per user per week. Predictors: Device type, OS version, number of installed plugins.
Crash counts are highly overdispersed: most users experience zero or one crash, but some experience many. The variance (25.3) far exceeds the mean (2.1).
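As a back-of-the-envelope check on those two numbers, a method-of-moments calculation gives a rough sense of how large the dispersion parameter might be. It assumes the NB2 form, variance = mean + alpha * mean^2, and uses the raw (marginal) mean and variance. A fitted model estimates the parameter conditional on the predictors, so its value would usually differ.

```python
def moment_alpha(mean, variance):
    """Solve variance = mean + alpha * mean**2 for alpha (NB2 form assumed)."""
    return (variance - mean) / mean ** 2


# Mean and variance quoted in the example above.
alpha = moment_alpha(mean=2.1, variance=25.3)
print(f"alpha = {alpha:.2f}, variance / mean = {25.3 / 2.1:.1f}")
```

A variance-to-mean ratio this far above 1 is the kind of signal that makes a plain Poisson model implausible for these data.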
A Poisson model would underestimate the standard errors, which can make effects appear statistically significant when they are not. The Negative Binomial model accounts for the extra variability. After fitting, we find that users with more than 5 plugins have an IRR of 2.8 (p < 0.001), meaning they experience crashes at 2.8 times the rate of users with 0-5 plugins, controlling for device type and OS version.
Key Takeaway
Negative Binomial regression extends Poisson regression by adding a dispersion parameter that accommodates variance larger than the mean. Use it as your default for count data when there is any suspicion of overdispersion. It produces the same interpretable incidence rate ratios as Poisson but with properly calibrated uncertainty estimates.