Robust Standard Errors: When to Use Them (and Whether to Use Them by Default)
A practical guide to heteroscedasticity-robust and cluster-robust standard errors. Learn when standard errors are wrong, which corrections to apply, and whether to use robust standard errors by default.
Quick Hits
- Standard SEs assume constant variance; heteroscedasticity makes them wrong
- Robust SEs (HC0-HC3) are valid even when variance isn't constant
- HC3 is the default choice for most applications
- Clustering requires cluster-robust SEs; heteroscedasticity-robust isn't enough
- Many researchers now use robust SEs by default as "insurance"
TL;DR
Standard regression assumes constant variance of errors (homoscedasticity). When this fails, standard errors are wrong—usually too small, leading to too many false positives. Robust standard errors fix this without changing your coefficient estimates. HC3 is the recommended variant for heteroscedasticity. For clustered data, you need cluster-robust standard errors. Many researchers now advocate using robust SEs by default.
The Problem: When Standard Errors Go Wrong
What Standard OLS Assumes
Standard OLS regression assumes: $$\text{Var}(\epsilon_i) = \sigma^2 \quad \text{(constant for all } i \text{)}$$
This is homoscedasticity—equal variance.
What Heteroscedasticity Does
When variance isn't constant (heteroscedasticity):
- Coefficient estimates are still unbiased
- But standard errors are wrong
- Usually too small → inflated Type I error (see the simulation sketch below)
- Confidence intervals have wrong coverage
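To see the size distortion directly, here is a minimal simulation sketch: with a true slope of zero and error variance growing with x, nominal 5% t-tests based on classical SEs reject too often, while HC3 stays close to the nominal rate. Exact rates vary by draw.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n, n_sims = 100, 2000
rejections_classical = 0
rejections_hc3 = 0

for _ in range(n_sims):
    x = rng.uniform(1, 10, n)
    # True slope is 0; error SD grows with x (heteroscedasticity)
    y = 1 + rng.normal(0, 0.5 * x, n)
    X = sm.add_constant(x)
    fit = sm.OLS(y, X).fit()
    rejections_classical += fit.pvalues[1] < 0.05
    rejections_hc3 += fit.get_robustcov_results(cov_type='HC3').pvalues[1] < 0.05

print(f"Classical SE rejection rate: {rejections_classical / n_sims:.3f}")  # typically well above 0.05
print(f"HC3 rejection rate:          {rejections_hc3 / n_sims:.3f}")        # close to 0.05
```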
Common Causes
- Scale effects: Variance increases with fitted values (e.g., revenue models)
- Group differences: Different subpopulations have different variances
- Model misspecification: Missing variables or wrong functional form
- Heavy tails: Outliers create heteroscedasticity-like effects
The Solution: Robust Standard Errors
The Idea
Instead of assuming constant variance, estimate the variance-covariance matrix of coefficients directly from the data, allowing each observation to have its own variance.
The Sandwich Estimator
Called "sandwich" because of its form:
$$\text{Var}(\hat{\beta}) = (X'X)^{-1} X' \hat{\Omega} X (X'X)^{-1}$$
The outer $(X'X)^{-1}$ factors are the "bread"; the middle term $X' \hat{\Omega} X$ is the "meat," where $\hat{\Omega}$ is a diagonal matrix whose entries are squared residuals (possibly rescaled, as the HC variants below show).
The HC Family: Which to Use?
HC0 (White's Original)
$$\hat{\Omega}_{ii} = \hat{e}_i^2$$
Simply uses squared residuals. Tends to be too liberal (underestimates SEs) in small samples.
HC1 (Degrees of Freedom Adjustment)
$$\hat{\Omega}_{ii} = \frac{n}{n-k} \hat{e}_i^2$$
Adjusts for the number of parameters. Better than HC0, but still often too liberal.
HC2 (Leverage Adjustment)
$$\hat{\Omega}_{ii} = \frac{\hat{e}_i^2}{1-h_{ii}}$$
Where $h_{ii}$ is the leverage of observation $i$. Accounts for high-leverage points.
HC3 (MacKinnon-White)
$$\hat{\Omega}_{ii} = \frac{\hat{e}_i^2}{(1-h_{ii})^2}$$
More aggressive adjustment for leverage. Most conservative, generally recommended.
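To make the formulas concrete, here is a minimal from-scratch sketch of the sandwich estimator with each $\hat{\Omega}$ variant. The helper name `hc_standard_errors` is illustrative; for real work, use a library implementation such as statsmodels or sandwich.

```python
import numpy as np

def hc_standard_errors(X, residuals, variant="HC3"):
    """Heteroscedasticity-consistent SEs via the sandwich formula.

    Assumes X (n x k) already includes a constant column.
    """
    n, k = X.shape
    bread = np.linalg.inv(X.T @ X)
    # Leverages: diagonal of the hat matrix H = X (X'X)^{-1} X'
    h = np.einsum('ij,jk,ik->i', X, bread, X)
    e2 = residuals ** 2
    if variant == "HC0":
        omega = e2
    elif variant == "HC1":
        omega = e2 * n / (n - k)
    elif variant == "HC2":
        omega = e2 / (1 - h)
    elif variant == "HC3":
        omega = e2 / (1 - h) ** 2
    else:
        raise ValueError(f"unknown variant: {variant}")
    meat = X.T @ (X * omega[:, None])  # X' diag(omega) X
    sandwich = bread @ meat @ bread
    return np.sqrt(np.diag(sandwich))
```

The result should match a library's robust `bse` for the same variant, which is a useful sanity check.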
Recommendation
Use HC3 for most applications:
- Most robust in small samples
- Good Type I error control
- Standard in economics and social sciences
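In practice you rarely compute these by hand; statsmodels, for example, accepts a `cov_type` argument at fit time. A quick sketch, assuming a data frame with columns `x` and `y`:

```python
import statsmodels.formula.api as smf

# Request HC3 robust SEs when fitting; coefficients are unchanged,
# only the covariance estimate (and hence SEs, t-stats, CIs) differs
model = smf.ols('y ~ x', data=data).fit(cov_type='HC3')
print(model.summary())
```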
Code: Robust Standard Errors
Python
```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf


def compare_standard_errors(model):
    """Compare classical and robust standard errors.

    Parameters
    ----------
    model : statsmodels RegressionResults
        Fitted OLS model.

    Returns
    -------
    DataFrame comparing SE types.
    """
    comparison = pd.DataFrame({
        'Variable': model.params.index,
        'Coefficient': np.asarray(model.params),
        'Standard SE': np.asarray(model.bse),
    })
    # get_robustcov_results returns plain arrays (no pandas labels),
    # so wrap with np.asarray for consistency
    for cov in ('HC0', 'HC1', 'HC2', 'HC3'):
        comparison[cov] = np.asarray(model.get_robustcov_results(cov_type=cov).bse)
    # Ratio of robust to classical SEs: values far from 1 signal heteroscedasticity
    comparison['HC3/Std Ratio'] = comparison['HC3'] / comparison['Standard SE']
    return comparison


def robust_regression_summary(model, cov_type='HC3'):
    """Get a regression summary table with robust SEs."""
    robust_model = model.get_robustcov_results(cov_type=cov_type)
    ci = np.asarray(robust_model.conf_int())  # one (lower, upper) row per parameter
    summary = pd.DataFrame({
        'Variable': model.params.index,
        'Coefficient': np.asarray(robust_model.params),
        'Robust SE': np.asarray(robust_model.bse),
        't-statistic': np.asarray(robust_model.tvalues),
        'p-value': np.asarray(robust_model.pvalues),
        'CI Lower': ci[:, 0],
        'CI Upper': ci[:, 1],
    })
    return summary, robust_model


def detect_heteroscedasticity(model):
    """Test for heteroscedasticity using the Breusch-Pagan and White tests."""
    from statsmodels.stats.diagnostic import het_breuschpagan, het_white

    residuals = model.resid
    exog = model.model.exog
    # Breusch-Pagan: regress squared residuals on the regressors
    bp_stat, bp_p, _, _ = het_breuschpagan(residuals, exog)
    # White test: adds squares and cross-products; may fail on collinear designs
    try:
        white_stat, white_p, _, _ = het_white(residuals, exog)
    except Exception:
        white_stat, white_p = np.nan, np.nan
    return {
        'breusch_pagan': {'statistic': bp_stat, 'p_value': bp_p},
        'white': {'statistic': white_stat, 'p_value': white_p},
        'heteroscedasticity_detected': bool(bp_p < 0.05 or white_p < 0.05),
    }


# Example
if __name__ == "__main__":
    np.random.seed(42)
    n = 200
    x = np.random.uniform(1, 10, n)
    # Error SD grows with x, so the data are heteroscedastic
    y = 2 + 3 * x + np.random.normal(0, 1 + 0.5 * x, n)
    data = pd.DataFrame({'x': x, 'y': y})

    model = smf.ols('y ~ x', data=data).fit()

    print("Heteroscedasticity Tests")
    print("=" * 50)
    het_results = detect_heteroscedasticity(model)
    print(f"Breusch-Pagan p-value: {het_results['breusch_pagan']['p_value']:.4f}")
    print(f"Heteroscedasticity detected: {het_results['heteroscedasticity_detected']}")

    print("\n" + "=" * 50)
    print("Standard Error Comparison")
    print(compare_standard_errors(model).to_string(index=False))

    print("\n" + "=" * 50)
    print("Regression with HC3 Robust SEs")
    robust_summary, robust_model = robust_regression_summary(model)
    print(robust_summary.to_string(index=False))
```
R
```r
library(tidyverse)
library(sandwich)  # For robust SEs
library(lmtest)    # For coeftest / coefci

compare_standard_errors <- function(model) {
  #' Compare classical and robust standard errors
  std_se <- summary(model)$coefficients[, "Std. Error"]
  hc0_se <- sqrt(diag(vcovHC(model, type = "HC0")))
  hc1_se <- sqrt(diag(vcovHC(model, type = "HC1")))
  hc2_se <- sqrt(diag(vcovHC(model, type = "HC2")))
  hc3_se <- sqrt(diag(vcovHC(model, type = "HC3")))
  tibble(
    Variable = names(coef(model)),
    Coefficient = coef(model),
    `Standard SE` = std_se,
    HC0 = hc0_se,
    HC1 = hc1_se,
    HC2 = hc2_se,
    HC3 = hc3_se,
    `HC3/Std Ratio` = hc3_se / std_se
  )
}

robust_summary <- function(model, type = "HC3") {
  #' Model summary with robust SEs
  coeftest(model, vcov = vcovHC(model, type = type))
}

robust_confint <- function(model, type = "HC3", level = 0.95) {
  #' Confidence intervals with robust SEs
  #' (base confint() ignores a vcov. argument; lmtest::coefci accepts one)
  coefci(model, vcov. = vcovHC(model, type = type), level = level)
}

detect_heteroscedasticity <- function(model) {
  #' Test for heteroscedasticity
  bp_test <- lmtest::bptest(model)
  # White-style test: auxiliary regression on fitted values and their squares
  white_test <- lmtest::bptest(model, ~ fitted(model) + I(fitted(model)^2))
  list(
    breusch_pagan = list(statistic = bp_test$statistic,
                         p_value = bp_test$p.value),
    white = list(statistic = white_test$statistic,
                 p_value = white_test$p.value),
    heteroscedasticity_detected = bp_test$p.value < 0.05 | white_test$p.value < 0.05
  )
}

# Example
set.seed(42)
n <- 200
x <- runif(n, 1, 10)
y <- 2 + 3 * x + rnorm(n, 0, 1 + 0.5 * x)  # Error SD increases with x
data <- tibble(x = x, y = y)

model <- lm(y ~ x, data = data)

cat("Heteroscedasticity Tests\n")
cat(strrep("=", 50), "\n")
het <- detect_heteroscedasticity(model)
cat(sprintf("Breusch-Pagan p-value: %.4f\n", het$breusch_pagan$p_value))
cat(sprintf("Detected: %s\n", het$heteroscedasticity_detected))

cat("\n", strrep("=", 50), "\n")
cat("Standard Error Comparison\n")
print(compare_standard_errors(model))

cat("\n", strrep("=", 50), "\n")
cat("Robust Summary (HC3)\n")
print(robust_summary(model))
```
Cluster-Robust Standard Errors
When Heteroscedasticity-Robust Isn't Enough
When observations are clustered (students in classrooms, employees in companies, repeated measures per user), errors are correlated within clusters.
Problem: Even HC3 assumes independent observations. Clustered data violates this.
Solution: Cluster-robust standard errors account for within-cluster correlation.
When to Use Cluster-Robust SEs
- Hierarchical data: Individuals nested in groups
- Panel data: Same units observed over time
- Multi-site studies: Treatment in different locations
- A/B tests with user-day observations: Same users appear multiple times
Code: Cluster-Robust SEs
```python
# Python with statsmodels: request cluster-robust SEs at fit time
model = smf.ols('y ~ treatment', data=data).fit(
    cov_type='cluster',
    cov_kwds={'groups': data['user_id']}
)

# Or post hoc on an already-fitted model
robust_model = model.get_robustcov_results(
    cov_type='cluster',
    groups=data['user_id']
)
```

```r
# R with sandwich
library(sandwich)
library(lmtest)

# Cluster-robust SEs, clustered by user
cluster_robust <- vcovCL(model, cluster = ~ user_id)
coeftest(model, vcov = cluster_robust)
```
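To see why heteroscedasticity-robust SEs aren't enough here, a small simulation sketch (column names are illustrative): treatment is assigned at the user level and a shared user effect induces within-cluster correlation, so HC3 understates the uncertainty while cluster-robust SEs reflect it.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n_users, obs_per_user = 50, 10
user_id = np.repeat(np.arange(n_users), obs_per_user)
# Treatment assigned per user, constant within cluster
treatment = np.repeat(rng.integers(0, 2, n_users), obs_per_user)
# Shared user-level shock creates within-cluster correlation
user_effect = np.repeat(rng.normal(0, 1, n_users), obs_per_user)
y = 0.5 * treatment + user_effect + rng.normal(0, 1, n_users * obs_per_user)
data = pd.DataFrame({'y': y, 'treatment': treatment, 'user_id': user_id})

fit = smf.ols('y ~ treatment', data=data).fit()
hc3 = fit.get_robustcov_results(cov_type='HC3')
clustered = fit.get_robustcov_results(cov_type='cluster', groups=data['user_id'])
print(f"HC3 SE:       {hc3.bse[1]:.3f}")        # too small with clustered errors
print(f"Clustered SE: {clustered.bse[1]:.3f}")  # larger, correctly so
```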
Level of Clustering
Cluster at the level of the treatment assignment or the highest level of correlation:
- Treatment assigned to classrooms → cluster by classroom
- Treatment assigned to users → cluster by user
- When in doubt, cluster at a higher level (more conservative)
The Case for Default Robust SEs
Arguments For
- Insurance policy: If homoscedasticity holds, you lose a little efficiency (~5%). If it doesn't, you avoid invalid inference.
- Robustness: Your conclusions don't depend on an assumption that's hard to verify.
- Standard practice: Many fields (economics, political science) now use robust SEs by default.
- Model agnosticism: Works regardless of functional form misspecification.
Arguments Against
- Efficiency loss: When homoscedasticity holds, standard SEs are more efficient.
- Small samples: Robust SEs can be unreliable with small n (< 50).
- Model checking: Using robust SEs might mask model problems worth investigating.
- Doesn't fix everything: Won't help with non-independence (you need clustering for that).
Practical Recommendation
Use HC3 by default when:
- Sample size is moderate to large (n > 50)
- You have any doubt about constant variance
- You want results robust to assumptions
Check homoscedasticity when:
- Sample is small
- Efficiency matters (tight budget, marginal power)
- You want to understand your data better
When Robust SEs Aren't Enough
Problem: Wrong Model
Robust SEs fix inference, not estimation. If your model is wrong (wrong functional form, omitted variables), coefficients are biased—robust SEs won't help.
Problem: Clustered Data
Heteroscedasticity-robust SEs assume independence. Clustered data needs cluster-robust SEs.
Problem: Small Samples
All robust SE methods rely on large-sample theory. With n below roughly 30-50:
- HC3 is more reliable than HC0
- Consider the wild bootstrap for better small-sample inference (sketched below)
- Mixed effects models might be better for clustered data
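A minimal wild bootstrap sketch, using Rademacher weights on OLS residuals; the helper name `wild_bootstrap_ci` is illustrative, not a library function.

```python
import numpy as np
import statsmodels.api as sm

def wild_bootstrap_ci(X, y, n_boot=999, level=0.95, seed=0):
    """Percentile CI for OLS coefficients via the wild bootstrap."""
    rng = np.random.default_rng(seed)
    fit = sm.OLS(y, X).fit()
    fitted, resid = fit.fittedvalues, fit.resid
    boot = np.empty((n_boot, X.shape[1]))
    for b in range(n_boot):
        # Rademacher weights: flip residual signs at random, keep X fixed
        v = rng.choice([-1.0, 1.0], size=len(y))
        y_star = fitted + resid * v
        boot[b] = sm.OLS(y_star, X).fit().params
    alpha = (1 - level) / 2
    # Rows are (lower, upper); columns are coefficients
    return np.quantile(boot, [alpha, 1 - alpha], axis=0)
```

For example, `wild_bootstrap_ci(sm.add_constant(x), y)` returns lower/upper bounds for the intercept and slope. For clustered data, the wild cluster bootstrap instead draws one weight per cluster rather than one per observation.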
Problem: Many Clusters, Few Observations per Cluster
Cluster-robust SEs need enough clusters (usually > 30-50) for reliable inference. With few clusters:
- Consider mixed effects models
- Use wild cluster bootstrap
- Report cautiously
Summary: Which SE to Use When
| Situation | Recommended SE |
|---|---|
| Standard regression, large n | HC3 or standard |
| Suspected heteroscedasticity | HC3 |
| Clustered data, many clusters | Cluster-robust |
| Clustered data, few clusters | Wild cluster bootstrap |
| Panel data | Two-way clustering or panel methods |
| Small sample (n < 50) | HC3 with caution, consider bootstrap |
Reporting Guidelines
When using robust SEs:
- State which correction: "We report HC3 heteroscedasticity-robust standard errors."
- Justify the clustering level: "Standard errors are clustered at the user level to account for repeated observations."
- Compare when relevant: "Results are similar with standard SEs (see Appendix)."
- Don't over-interpret: Robust SEs protect inference but don't validate the model.
Related Methods
- Regression for Analysts (Pillar) - Complete regression framework
- Linear Regression Diagnostics - Detecting heteroscedasticity
- Independence Assumption - When clustering matters
- Bootstrap Confidence Intervals - Alternative inference
Key Takeaway
Standard regression assumes constant error variance. When this fails, standard errors are wrong—usually too small, causing too many false positives. Robust standard errors (HC3 recommended) fix inference without changing coefficients. For clustered data, heteroscedasticity-robust isn't enough—you need cluster-robust SEs. Many researchers now use robust SEs by default: the efficiency loss when they're unnecessary is small, but the cost of wrong inference when they're needed is large.
References
- MacKinnon, J. G., & White, H. (1985). Some heteroskedasticity-consistent covariance matrix estimators with improved finite sample properties. *Journal of Econometrics*, 29(3), 305–325.
- Zeileis, A., Köll, S., & Graham, N. (2020). Various versatile variances: An object-oriented implementation of clustered covariances in R. *Journal of Statistical Software*, 95(1), 1–36. https://www.jstatsoft.org/article/view/v095i01
- Cameron, A. C., & Miller, D. L. (2015). A practitioner's guide to cluster-robust inference. *Journal of Human Resources*, 50(2), 317–372.
- https://doi.org/10.1016/j.jeconom.2004.02.005
- https://arxiv.org/abs/1903.03949
Frequently Asked Questions
Do robust standard errors change my coefficients?
No. Robust SEs change only the estimated variance-covariance matrix; coefficient estimates are identical. Only the SEs, test statistics, p-values, and confidence intervals change.
Should I use robust standard errors by default?
With moderate-to-large samples, many researchers do: the efficiency loss when homoscedasticity holds is small, while the cost of invalid inference when it fails is large. With small samples (n < 50), use HC3 with caution or consider bootstrap methods.
What's the difference between HC0, HC1, HC2, and HC3?
All plug squared residuals into the sandwich estimator. HC0 uses them directly, HC1 adds a degrees-of-freedom correction, HC2 divides by $1 - h_{ii}$ to adjust for leverage, and HC3 divides by $(1 - h_{ii})^2$, making it the most conservative and the generally recommended choice.