Two-Group Comparisons

Comparing Rates: Events per User, Events per Time, and Rate Ratios

How to properly compare rates like clicks per user, purchases per session, or events per hour. Covers rate ratios, Poisson tests, and common pitfalls with ratio metrics.

Quick Hits

  • Rates (events/exposure) need different methods than simple counts or proportions
  • Rate ratios compare relative event rates; rate differences compare absolute rates
  • Poisson assumes variance equals mean—check for overdispersion
  • Negative binomial handles overdispersion when Poisson doesn't fit

TL;DR

Comparing rates (events per user, events per hour, etc.) requires different methods than comparing means or proportions. Poisson and negative binomial models handle the count nature of the numerators. Rate ratios express relative differences; rate differences express absolute differences. Watch for overdispersion (variance > mean), which makes Poisson standard errors and p-values unreliable.


What Makes Rates Special

A rate is a count divided by exposure:

$$\text{Rate} = \frac{\text{Number of events}}{\text{Exposure (users, time, etc.)}}$$

Examples:

  • Clicks per user
  • Purchases per session
  • Errors per hour
  • Support tickets per 1,000 users

Rates inherit challenges from both counts (discrete, non-negative, often zero-inflated) and ratios (variance depends on both numerator and denominator).


Method 1: Comparing Rates Directly

For simple comparisons with reasonably large event counts, you can compare rates using a normal approximation.

Rate Ratio

import numpy as np
from scipy import stats

def rate_ratio_test(events1, exposure1, events2, exposure2):
    """
    Compare two rates using rate ratio.

    Rate ratio = rate2 / rate1
    """
    rate1 = events1 / exposure1
    rate2 = events2 / exposure2

    rate_ratio = rate2 / rate1

    # Log rate ratio with standard error
    log_rr = np.log(rate_ratio)
    se_log_rr = np.sqrt(1/events1 + 1/events2)

    # 95% CI for rate ratio
    ci_lower = np.exp(log_rr - 1.96 * se_log_rr)
    ci_upper = np.exp(log_rr + 1.96 * se_log_rr)

    # Z-test
    z = log_rr / se_log_rr
    p_value = 2 * (1 - stats.norm.cdf(abs(z)))

    return {
        'rate1': rate1,
        'rate2': rate2,
        'rate_ratio': rate_ratio,
        'ci_95': (ci_lower, ci_upper),
        'p_value': p_value
    }


# Example: clicks per user
control_clicks = 500
control_users = 10000

treatment_clicks = 600
treatment_users = 10000

result = rate_ratio_test(control_clicks, control_users,
                         treatment_clicks, treatment_users)

print(f"Control rate: {result['rate1']:.4f}")
print(f"Treatment rate: {result['rate2']:.4f}")
print(f"Rate ratio: {result['rate_ratio']:.2f}")
print(f"95% CI: [{result['ci_95'][0]:.2f}, {result['ci_95'][1]:.2f}]")
print(f"P-value: {result['p_value']:.4f}")

Rate Difference

def rate_difference_test(events1, exposure1, events2, exposure2):
    """
    Compare two rates using rate difference.

    Rate difference = rate2 - rate1
    """
    rate1 = events1 / exposure1
    rate2 = events2 / exposure2

    rate_diff = rate2 - rate1

    # Standard error of difference
    se_diff = np.sqrt(events1/exposure1**2 + events2/exposure2**2)

    # 95% CI
    ci_lower = rate_diff - 1.96 * se_diff
    ci_upper = rate_diff + 1.96 * se_diff

    # Z-test
    z = rate_diff / se_diff
    p_value = 2 * (1 - stats.norm.cdf(abs(z)))

    return {
        'rate_difference': rate_diff,
        'ci_95': (ci_lower, ci_upper),
        'p_value': p_value
    }


result = rate_difference_test(control_clicks, control_users,
                              treatment_clicks, treatment_users)

print(f"Rate difference: {result['rate_difference']:.4f}")
print(f"95% CI: [{result['ci_95'][0]:.4f}, {result['ci_95'][1]:.4f}]")

Method 2: Poisson Regression

Use Poisson regression when you want to model event rates with covariates or account for varying exposure per unit; the log of exposure enters the model as an offset.

Basic Poisson Rate Comparison

import statsmodels.api as sm
import pandas as pd

def poisson_rate_comparison(events1, exposure1, events2, exposure2):
    """
    Compare rates using Poisson regression with offset.
    """
    # Create data
    df = pd.DataFrame({
        'events': [events1, events2],
        'treatment': [0, 1],
        'exposure': [exposure1, exposure2]
    })

    # Poisson model with log(exposure) as offset
    model = sm.GLM(df['events'],
                   sm.add_constant(df['treatment']),
                   family=sm.families.Poisson(),
                   offset=np.log(df['exposure']))

    result = model.fit()

    # Rate ratio is exp(treatment coefficient)
    rate_ratio = np.exp(result.params['treatment'])
    ci_lower = np.exp(result.conf_int().loc['treatment', 0])
    ci_upper = np.exp(result.conf_int().loc['treatment', 1])

    return {
        'rate_ratio': rate_ratio,
        'ci_95': (ci_lower, ci_upper),
        'p_value': result.pvalues['treatment'],
        'model_summary': result.summary()
    }
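
As a quick check, this function can be run on the same aggregate click counts used in the rate-ratio example above (a minimal sketch reusing the control_clicks, control_users, treatment_clicks, and treatment_users variables defined earlier); the resulting rate ratio should closely match the direct calculation:

# Reuses the aggregate counts defined in the rate-ratio example above
agg = poisson_rate_comparison(control_clicks, control_users,
                              treatment_clicks, treatment_users)

print(f"Rate ratio: {agg['rate_ratio']:.2f}")
print(f"95% CI: [{agg['ci_95'][0]:.2f}, {agg['ci_95'][1]:.2f}]")
print(f"P-value: {agg['p_value']:.4f}")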

User-Level Poisson Regression

When you have individual-level data:

def user_level_rate_analysis(df, event_col, treatment_col, exposure_col=None):
    """
    Poisson regression on user-level count data.

    df: DataFrame with one row per user
    event_col: column with event counts
    treatment_col: column with treatment indicator
    exposure_col: column with exposure (optional, default=1)
    """
    if exposure_col is None:
        df = df.copy()
        df['exposure'] = 1

    formula = f'{event_col} ~ {treatment_col}'

    model = sm.GLM.from_formula(
        formula,
        data=df,
        family=sm.families.Poisson(),
        offset=np.log(df[exposure_col] if exposure_col else df['exposure'])
    )

    result = model.fit()

    return {
        'rate_ratio': np.exp(result.params[treatment_col]),
        'ci_95': (np.exp(result.conf_int().loc[treatment_col, 0]),
                  np.exp(result.conf_int().loc[treatment_col, 1])),
        'p_value': result.pvalues[treatment_col]
    }
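
A usage sketch with simulated user-level data (column names like clicks and days_observed are illustrative, not from any particular dataset):

# Simulated user-level data: varying observation windows, ~10% treatment lift
rng = np.random.default_rng(0)
n = 2000
sim = pd.DataFrame({'treatment': rng.integers(0, 2, n),
                    'days_observed': rng.integers(7, 31, n)})
base_rate = 0.05 * np.where(sim['treatment'] == 1, 1.10, 1.00)
sim['clicks'] = rng.poisson(base_rate * sim['days_observed'])

out = user_level_rate_analysis(sim, 'clicks', 'treatment',
                               exposure_col='days_observed')
print(f"Rate ratio: {out['rate_ratio']:.3f}, p-value: {out['p_value']:.4f}")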

R Implementation

# Poisson regression with offset
model <- glm(events ~ treatment + offset(log(exposure)),
             family = poisson(), data = df)
summary(model)
exp(coef(model))  # Rate ratios
exp(confint(model))  # CIs for rate ratios

The Overdispersion Problem

Poisson assumes variance equals mean. Real count data often has variance > mean (overdispersion).

Detecting Overdispersion

def check_overdispersion(events, predicted_means):
    """
    Check for overdispersion in Poisson model.

    Estimates the dispersion as the Pearson chi-square statistic divided
    by the residual degrees of freedom. A value well above 1 suggests
    overdispersion.
    """
    # Pearson residuals
    residuals = (events - predicted_means) / np.sqrt(predicted_means)

    # Dispersion parameter estimate
    n = len(events)
    p = 1  # number of fitted parameters (1 for an intercept-only model; adjust for your model)
    dispersion = np.sum(residuals**2) / (n - p)

    return {
        'dispersion_estimate': dispersion,
        'overdispersed': dispersion > 1.5,
        'interpretation': 'overdispersed' if dispersion > 1.5 else 'OK'
    }


# Simulate overdispersed data
np.random.seed(42)
# Generate user-level data with extra variation
user_means = np.random.gamma(2, 2, 1000)  # Heterogeneous rates
events = np.random.poisson(user_means)

# Check dispersion
variance = np.var(events)
mean = np.mean(events)
print(f"Mean: {mean:.2f}, Variance: {variance:.2f}")
print(f"Variance/Mean ratio: {variance/mean:.2f}")  # Should be ~1 for Poisson
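
Feeding the same simulated counts into check_overdispersion, with the overall mean as the fitted value (an intercept-only model, so p = 1), flags the problem:

# Intercept-only fit: every observation's predicted mean is the overall mean
fitted = np.full(len(events), events.mean())
print(check_overdispersion(events, fitted))
# Expect a dispersion estimate well above 1 for this gamma-Poisson mixture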

Consequences of Ignoring Overdispersion

  • Standard errors are too small
  • P-values are too optimistic
  • Confidence intervals are too narrow
  • False positive rate inflates
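
A small simulation makes the last point concrete (a sketch, not a benchmark; the gamma parameters are chosen only to create overdispersion). Under a true null, a plain Poisson model on overdispersed data rejects far more often than the nominal 5%:

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n_per_arm, n_sims, rejections = 500, 200, 0

for _ in range(n_sims):
    # True null: both arms draw from the same overdispersed (gamma-Poisson) process
    means = rng.gamma(0.5, 4.0, 2 * n_per_arm)          # heterogeneous per-user rates
    y = rng.poisson(means)
    x = sm.add_constant(np.repeat([0, 1], n_per_arm))   # fake treatment labels
    fit = sm.GLM(y, x, family=sm.families.Poisson()).fit()
    if fit.pvalues[1] < 0.05:
        rejections += 1

print(f"False positive rate: {rejections / n_sims:.1%} (nominal: 5%)")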

Method 3: Negative Binomial Regression

Handles overdispersion by adding a dispersion parameter.

import statsmodels.api as sm
import statsmodels.formula.api as smf

def negative_binomial_rate_comparison(df, event_col, treatment_col, exposure_col):
    """
    Negative binomial regression for overdispersed count data.
    """
    formula = f'{event_col} ~ {treatment_col}'

    model = smf.negativebinomial(
        formula,
        data=df,
        offset=np.log(df[exposure_col])
    )

    result = model.fit()

    return {
        'rate_ratio': np.exp(result.params[treatment_col]),
        'ci_95': (np.exp(result.conf_int().loc[treatment_col, 0]),
                  np.exp(result.conf_int().loc[treatment_col, 1])),
        'p_value': result.pvalues[treatment_col],
        'alpha': result.params['alpha']  # Dispersion parameter
    }
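
A usage sketch on simulated data (column names are illustrative; per-user rates are drawn from a gamma distribution so the counts are genuinely overdispersed):

# Simulated overdispersed user-level data with varying exposure
rng = np.random.default_rng(7)
n = 2000
nb_df = pd.DataFrame({'treatment': rng.integers(0, 2, n),
                      'exposure': rng.integers(7, 31, n)})
user_rate = rng.gamma(1.0, 0.05, n) * np.where(nb_df['treatment'] == 1, 1.15, 1.00)
nb_df['events'] = rng.poisson(user_rate * nb_df['exposure'])

out = negative_binomial_rate_comparison(nb_df, 'events', 'treatment', 'exposure')
print(f"Rate ratio: {out['rate_ratio']:.3f}")
print(f"95% CI: [{out['ci_95'][0]:.3f}, {out['ci_95'][1]:.3f}]")
print(f"Dispersion (alpha): {out['alpha']:.2f}")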

R Implementation

library(MASS)

# Negative binomial regression
model <- glm.nb(events ~ treatment + offset(log(exposure)), data = df)
summary(model)
exp(coef(model))

Poisson vs. Negative Binomial

| Aspect                 | Poisson     | Negative Binomial     |
|------------------------|-------------|-----------------------|
| Variance assumption    | Var = Mean  | Var = Mean + α×Mean²  |
| Handles overdispersion | No          | Yes                   |
| Parameters             | 1 (mean)    | 2 (mean, dispersion)  |
| When to use            | Var ≈ Mean  | Var > Mean            |
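
To see the negative binomial variance form in action, here is a quick sanity check on simulated data (a sketch; for a gamma-Poisson mixture with gamma shape k, the NB2 dispersion is α = 1/k, so α = 0.5 below):

# Gamma(shape=2) mixture of Poissons implies NB2 dispersion alpha = 1/2
rng = np.random.default_rng(3)
counts = rng.poisson(rng.gamma(2.0, 2.0, 100_000))

m, v, alpha = counts.mean(), counts.var(), 0.5
print(f"Mean: {m:.2f}")
print(f"Observed variance: {v:.2f}")
print(f"Poisson prediction (Var = Mean): {m:.2f}")
print(f"NB prediction (Var = Mean + alpha*Mean^2): {m + alpha * m**2:.2f}")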

Method 4: Quasi-Poisson (Robust SE)

Alternative to negative binomial: keep the Poisson point estimates but correct the standard errors. Quasi-Poisson scales them by the estimated dispersion; the closely related sandwich (robust) estimator used below achieves a similar correction without assuming a particular variance function.

def quasi_poisson_comparison(df, event_col, treatment_col, exposure_col):
    """
    Poisson model with robust standard errors.
    """
    formula = f'{event_col} ~ {treatment_col}'

    model = sm.GLM.from_formula(
        formula,
        data=df,
        family=sm.families.Poisson(),
        offset=np.log(df[exposure_col])
    )

    # Fit with robust (sandwich) standard errors
    result = model.fit(cov_type='HC1')

    return {
        'rate_ratio': np.exp(result.params[treatment_col]),
        'robust_se': result.bse[treatment_col],
        'p_value': result.pvalues[treatment_col]
    }
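
For example, running it on the simulated overdispersed data from the negative binomial section (nb_df above) gives the same point estimate as a plain Poisson fit, but with a robust standard error that reflects the extra variation:

# Reuses nb_df from the negative binomial example above
qp = quasi_poisson_comparison(nb_df, 'events', 'treatment', 'exposure')
print(f"Rate ratio: {qp['rate_ratio']:.3f}")
print(f"Robust SE (log scale): {qp['robust_se']:.3f}")
print(f"P-value: {qp['p_value']:.4f}")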

Common Pitfalls

Ignoring Exposure Differences

If users have different observation periods, you can't just compare total events.

Wrong: User A (observed 30 days) had 10 events, User B (observed 7 days) had 5 events. A > B.

Right: Rate A = 10/30 = 0.33/day, Rate B = 5/7 = 0.71/day. B > A.

Treating Rates Like Proportions

Rates can exceed 1 (unlike proportions). Don't use proportion tests on rates.

Averaging Ratios

The average of user-level ratios ≠ ratio of totals. Use proper methods that handle the aggregation correctly.

# Example of the problem
user_events = [2, 5, 10]
user_exposures = [10, 10, 100]

# Average of individual rates
avg_rate = np.mean(np.array(user_events) / np.array(user_exposures))

# Overall rate
overall_rate = sum(user_events) / sum(user_exposures)

print(f"Average of rates: {avg_rate:.3f}")
print(f"Overall rate: {overall_rate:.3f}")
# These differ! Overall is usually what you want.

Decision Guide

Is your metric a rate (events / exposure)?
│
├── NO: Use methods for means or proportions instead
│
└── YES: Ask two questions about your data:
    │
    ├── Is exposure the same for all units?
    │   ├── YES: Simple rate comparison or t-test may work
    │   └── NO: Use regression with offset for exposure
    │
    └── Is variance close to mean?
        ├── YES: Poisson regression
        └── NO (overdispersed): Negative binomial or quasi-Poisson


Key Takeaway

Rates are ratios of counts to exposure, and they need methods designed for that structure. Poisson regression is the starting point for rate comparisons (using an offset for exposure), but check for overdispersion—if variance exceeds the mean, use negative binomial regression or robust standard errors. Always be clear about whether you're reporting rate ratios (relative) or rate differences (absolute).


Frequently Asked Questions

When should I use rate ratios vs. rate differences?
Rate ratios express relative change ('treatment has 20% more events'). Rate differences express absolute change ('treatment has 0.5 more events per user'). Choose based on what's more meaningful for your decision.
My variance is larger than my mean. What should I do?
This is overdispersion—common in real data. Use negative binomial models instead of Poisson, or use robust standard errors.
Can I just use a t-test on events per user?
Sometimes, especially with large samples. But t-tests can struggle with the discrete, often zero-inflated nature of count data. Poisson or negative binomial regression is often more appropriate.
