Mediation Analysis: Does Feature X Work Through Mechanism Y?
How mediation analysis identifies the causal mechanisms behind product effects. Learn when and how to decompose total effects into direct and indirect paths.
Quick Hits
- Mediation analysis decomposes a total causal effect into an indirect path (through the mediator) and a direct path (everything else)
- The classic Baron and Kenny approach is outdated; modern causal mediation analysis based on potential outcomes is more rigorous
- Sequential ignorability -- no unmeasured confounders of the mediator-outcome relationship -- is the critical and often unrealistic assumption
- Even with experimental treatment assignment, the mediator is observational, making unmeasured confounding of the M-Y path the central threat
- Understanding mechanisms helps product teams decide what to optimize: if a feature works through engagement but not comprehension, double down on engagement design
TL;DR
When a product change moves a metric, the natural next question is: why? Mediation analysis answers this by decomposing the total treatment effect into a direct effect and an indirect effect that flows through a hypothesized mechanism. It is the formal framework for answering "does Feature X work because it increases Engagement, or for some other reason?" This post covers the modern causal approach to mediation, the assumptions that make it valid, and how product teams can use it to make better decisions about what to build next.
Why Mechanisms Matter in Product Analytics
Knowing that a feature works is valuable. Knowing how it works is more valuable. Consider a scenario where you launched an in-app tutorial and observed a 15% lift in user activation. Two possible mechanisms:
- The tutorial teaches users how to use the product, reducing confusion.
- The tutorial forces users to interact with key features, creating habit loops.
If mechanism 1 dominates, you should invest in better educational content. If mechanism 2 dominates, you should invest in guided workflows and activation checklists. The treatment is the same; the optimal next step depends on the mechanism.
Mediation analysis formalizes this decomposition.
The Causal Framework
Setup
- Treatment (T): The intervention (e.g., tutorial shown vs. not shown).
- Mediator (M): The hypothesized mechanism (e.g., feature exploration in the first session).
- Outcome (Y): The metric you care about (e.g., 30-day activation).
The causal DAG looks like:
T --> M --> Y
T ---------> Y
The total effect of T on Y decomposes into:
- Indirect effect (IE): The portion of the effect that flows through M (the T --> M --> Y path).
- Direct effect (DE): Everything else (the T --> Y path that does not run through M).
Modern Causal Definitions
Using potential outcomes notation (Imai, Keele, Tingley, 2010):
- Average Causal Mediation Effect (ACME): E[Y(1, M(1)) - Y(1, M(0))] -- the effect of changing the mediator from what it would be under control to what it would be under treatment, while holding treatment at treated.
- Average Direct Effect (ADE): E[Y(1, M(0)) - Y(0, M(0))] -- the effect of changing treatment while holding the mediator at its control value.
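These two quantities add up to the total effect. The telescoping identity below, written here for reference in the same counterfactual notation, makes the decomposition explicit:

```latex
% Add and subtract the counterfactual E[Y(1, M(0))] to split the total effect
\[
\begin{aligned}
\mathrm{TE} &= E\bigl[Y(1, M(1)) - Y(0, M(0))\bigr] \\
            &= \underbrace{E\bigl[Y(1, M(1)) - Y(1, M(0))\bigr]}_{\mathrm{ACME}(1)}
             + \underbrace{E\bigl[Y(1, M(0)) - Y(0, M(0))\bigr]}_{\mathrm{ADE}(0)}
\end{aligned}
\]
```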
These definitions avoid the pitfalls of the classic Baron and Kenny approach, which relies on linear models and implicitly assumes no interaction between treatment and mediator.
The Classic vs. Modern Approach
Baron and Kenny (1986) -- Outdated but Still Common
The classic approach runs three regressions:
- Regress Y on T: check that treatment affects the outcome.
- Regress M on T: check that treatment affects the mediator.
- Regress Y on T and M: check that the mediator affects the outcome after controlling for treatment, and that the treatment coefficient shrinks.
The indirect effect is estimated as the product of coefficients, a x b, where a is the effect of T on M and b is the effect of M on Y controlling for T.
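For concreteness, here is a minimal R sketch of those three regressions. The data frame df and the column names treat, mediator, and outcome are hypothetical (treat assumed coded 0/1), purely to illustrate the product-of-coefficients estimate:

```r
# Baron-Kenny product-of-coefficients sketch (illustrative column names)
fit_total    <- lm(outcome ~ treat, data = df)             # step 1: Y on T
fit_mediator <- lm(mediator ~ treat, data = df)            # step 2: M on T
fit_outcome  <- lm(outcome ~ treat + mediator, data = df)  # step 3: Y on T and M

a <- coef(fit_mediator)["treat"]     # effect of T on M
b <- coef(fit_outcome)["mediator"]   # effect of M on Y, controlling for T
indirect <- a * b                    # classic indirect-effect estimate
direct   <- coef(fit_outcome)["treat"]
```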
Problems: This approach assumes linear relationships and no treatment-mediator interaction, and it rests on strong parametric assumptions. It also conflates statistical significance of individual paths with evidence of mediation.
Modern Causal Mediation Analysis
The modern approach, based on the potential outcomes framework:
- Defines direct and indirect effects using counterfactuals.
- Works with any type of outcome and mediator (binary, continuous, count).
- Allows treatment-mediator interactions.
- Makes the identifying assumptions explicit (especially sequential ignorability).
- Provides sensitivity analysis tools for violations of those assumptions.
Use the mediation package in R or equivalent implementations in Python.
The Critical Assumption: Sequential Ignorability
For mediation analysis to yield causal estimates, you need sequential ignorability:
- No unmeasured confounders of T and Y. If treatment is randomized, this is satisfied by design.
- No unmeasured confounders of M and Y, conditional on T and covariates. This is the hard part.
Even if you randomized treatment, the mediator is not randomized. Users who engage with a tutorial more deeply may differ from those who skim it, and those differences may also affect the outcome. This is the same challenge as any observational causal inference problem: confounding of a non-randomized variable.
Sequential ignorability is strong and often implausible. This does not mean mediation analysis is useless -- it means you must:
- Be transparent about the assumption.
- Run sensitivity analysis to assess how much confounding would be needed to invalidate the indirect effect.
- Treat results as suggestive rather than definitive unless you have exceptional data.
A Product Analytics Example
Context: Your team launched a "social proof" feature that shows new users how many others have completed onboarding. An A/B test showed a 10% lift in 14-day retention. You hypothesize the mechanism is increased onboarding completion.
Variables:
- T: Social proof feature (shown vs. not shown), randomized.
- M: Onboarding completion within 3 days.
- Y: 14-day retention.
- Covariates: Sign-up source, device type, country.
Analysis:
- Estimate the mediator model: logistic regression of onboarding completion on treatment and covariates.
- Estimate the outcome model: logistic regression of 14-day retention on treatment, onboarding completion, and covariates.
- Use the mediation package to compute ACME and ADE with quasi-Bayesian confidence intervals (a minimal sketch in R follows below).
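A minimal sketch of this workflow with the R mediation package. The data frame df and column names below are hypothetical placeholders for the variables above (treat coded 0/1):

```r
library(mediation)

# Mediator model: onboarding completion (binary) on treatment + covariates
med_fit <- glm(onboarding_complete ~ treat + signup_source + device_type + country,
               family = binomial(), data = df)

# Outcome model: 14-day retention (binary) on treatment, mediator, and covariates
out_fit <- glm(retained_14d ~ treat + onboarding_complete +
                 signup_source + device_type + country,
               family = binomial(), data = df)

# Quasi-Bayesian Monte Carlo estimates of ACME and ADE (the package default)
med_out <- mediate(med_fit, out_fit,
                   treat = "treat", mediator = "onboarding_complete",
                   sims = 1000)
summary(med_out)  # reports ACME, ADE, total effect, and proportion mediated
```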
Results (hypothetical):
- Total effect: +10% retention (from the A/B test).
- ACME (indirect via onboarding): +7% (95% CI: +4% to +10%).
- ADE (direct): +3% (95% CI: -1% to +6%).
Interpretation: About 70% of the social proof feature's effect on retention flows through increased onboarding completion. The remaining 30% operates through other channels (perhaps a general sense of social belonging).
Decision: Invest in further improving onboarding completion, since that is the primary mechanism.
Sensitivity Analysis
Because sequential ignorability is untestable, always run a sensitivity analysis. The Imai, Keele, and Yamamoto (2010) approach parameterizes the sensitivity in terms of rho, the correlation between the residuals of the mediator and outcome models.
At rho = 0 (no unmeasured confounding), you get your baseline ACME. As the magnitude of rho increases (stronger unmeasured confounding), the estimated ACME shifts. The question is: at what value of rho does the ACME cross zero?
If it takes a large rho to nullify the indirect effect, you have some confidence in the result. If a small rho is enough, the result is fragile.
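Here is a sketch of what this looks like with medsens() from the same R package. Note that medsens() supports only certain model pairs (for example, two linear models, or a probit GLM paired with a linear model), so this sketch assumes hypothetical continuous measures fit with lm rather than the logistic models above:

```r
library(mediation)

# Hypothetical continuous proxies so that medsens() applies
med_fit <- lm(onboarding_steps ~ treat + signup_source, data = df)
out_fit <- lm(sessions_day_14 ~ treat + onboarding_steps + signup_source, data = df)
med_out <- mediate(med_fit, out_fit, treat = "treat",
                   mediator = "onboarding_steps", sims = 1000)

# Vary rho, the correlation between the mediator- and outcome-model errors,
# and see where the estimated ACME crosses zero
sens_out <- medsens(med_out, rho.by = 0.05, effect.type = "indirect")
summary(sens_out)  # includes the rho at which the ACME would equal zero
plot(sens_out)     # ACME and its interval as a function of rho
```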
Common Pitfalls
Confusing mediation with moderation. Mediation is about how; moderation is about for whom or when. Running subgroup analyses (moderators) does not tell you about mechanisms (mediators).
Ambiguous mediator timing. Ensure the mediator is measured after treatment and before the outcome. If the timeline is unclear, the causal ordering is unclear.
Conditioning on a collider. If a variable is caused by both the mediator and the outcome, including it in the model induces a spurious association (collider bias) where none existed. Check your DAG before adding controls.
Ignoring treatment-mediator interaction. The classic approach assumes the effect of M on Y is the same regardless of treatment status. The modern approach allows for this interaction, and ignoring it can bias both direct and indirect effect estimates (a code sketch follows after these pitfalls).
Over-interpreting proportions. Saying "70% of the effect is mediated" sounds precise, but the proportion mediated is a ratio of estimates, each with uncertainty. The confidence interval on the proportion can be very wide. Report point estimates with intervals.
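On the interaction point, the R mediation package accommodates this when the outcome model itself contains the treatment-by-mediator term. A minimal sketch with the same hypothetical column names as the earlier example; with an interaction present, the package reports effects separately under treatment and under control:

```r
library(mediation)

# Mediator model as in the example above (hypothetical column names)
med_fit <- glm(onboarding_complete ~ treat + signup_source + device_type + country,
               family = binomial(), data = df)

# Outcome model with a treatment x mediator interaction term
out_fit_int <- glm(retained_14d ~ treat * onboarding_complete +
                     signup_source + device_type + country,
                   family = binomial(), data = df)

med_out_int <- mediate(med_fit, out_fit_int,
                       treat = "treat", mediator = "onboarding_complete",
                       sims = 1000)
summary(med_out_int)  # ACME/ADE under treatment and under control, plus averages
```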
When to Use Mediation Analysis
Use mediation analysis when:
- You have a well-defined mechanism hypothesis before looking at data.
- The treatment is randomized or you can make a credible case for no unmeasured confounding of T and Y.
- You have a measurable mediator that is temporally between treatment and outcome.
- You are willing to run sensitivity analysis and report it transparently.
Avoid it when:
- You are fishing for mechanisms without a prior hypothesis.
- The mediator is measured simultaneously with the outcome.
- You cannot articulate what unmeasured confounders of M and Y might exist.
For the broader causal inference toolkit, see our overview of methods for when experiments aren't possible.
References
- https://imai.fas.harvard.edu/research/files/BaronKenny.pdf
- https://www.annualreviews.org/doi/10.1146/annurev-statistics-031219-041408
- https://cran.r-project.org/web/packages/mediation/vignettes/mediation.pdf
Frequently Asked Questions
What is the difference between a mediator and a moderator?
Can I do mediation analysis with experimental data?
How many mediators can I include?
Key Takeaway
Mediation analysis tells you why a treatment works by decomposing the total effect into direct and indirect pathways, but the mediator-outcome relationship is observational even in experiments, making the sequential ignorability assumption the key vulnerability.