Bayesian Methods

Practical Bayes: Using PyMC, Stan, and brms for Real Analysis

Hands-on guide to Bayesian tools. Compare PyMC, Stan, and brms with real examples. Learn which tool fits your workflow and how to get started fast.

Share

Quick Hits

  • PyMC (Python): Best for custom models, deep integration with the Python data science stack
  • Stan (multi-language): Best for complex hierarchical models and production-grade MCMC performance
  • brms (R): Best for analysts familiar with R's formula syntax -- the fastest path from lm() to Bayesian
  • All three tools use MCMC sampling -- the concepts transfer across tools
  • Start with brms or PyMC for typical product analytics; move to Stan for custom or performance-critical models

TL;DR

Three tools dominate modern Bayesian analysis: PyMC (Python), Stan (multi-language), and brms (R). This guide shows you how to fit the same model in all three, compares their strengths and weaknesses, and helps you choose the right tool for your workflow. All three are production-ready and can handle typical product analytics problems.


Tool Comparison at a Glance

Feature PyMC Stan brms
Language Python C++ (R/Python/etc. interfaces) R
Learning curve Moderate Steep Gentle
Model flexibility High Very high Moderate (formula-based)
MCMC sampler NUTS (JAX/NumPyro backend) NUTS (best-in-class) Uses Stan backend
Speed Fast Fastest Fast (via Stan)
Diagnostics Built-in (ArviZ) Built-in + ShinyStan Built-in
Custom models Full control Full control Limited to formula syntax
Community Large, growing Large, established Large, R-focused
Best for Python data scientists Complex/custom models R analysts transitioning to Bayes

The Same Model in Three Tools

The Problem

You want to estimate the effect of a new checkout flow on conversion rate, controlling for user segment (new vs. returning).

Model: Logistic regression with treatment indicator and user segment.

logit(pi)=β0+β1treatmenti+β2returningi\text{logit}(p_i) = \beta_0 + \beta_1 \cdot \text{treatment}_i + \beta_2 \cdot \text{returning}_i

PyMC (Python)

import pymc as pm
import numpy as np
import arviz as az

# Simulate data
np.random.seed(42)
n = 5000
treatment = np.random.binomial(1, 0.5, n)
returning = np.random.binomial(1, 0.4, n)
logit_p = -2.0 + 0.3 * treatment + 0.5 * returning
p = 1 / (1 + np.exp(-logit_p))
converted = np.random.binomial(1, p, n)

# Model
with pm.Model() as checkout_model:
    # Priors
    beta_0 = pm.Normal('intercept', mu=0, sigma=2)
    beta_treatment = pm.Normal('treatment_effect', mu=0, sigma=1)
    beta_returning = pm.Normal('returning_effect', mu=0, sigma=1)

    # Linear predictor
    logit_p = beta_0 + beta_treatment * treatment + beta_returning * returning

    # Likelihood
    y = pm.Bernoulli('converted', logit_p=logit_p, observed=converted)

    # Sample
    trace = pm.sample(2000, tune=1000, cores=4, random_seed=42)

# Results
summary = az.summary(trace, var_names=['intercept', 'treatment_effect', 'returning_effect'])
print(summary)

# Probability treatment is positive
treatment_samples = trace.posterior['treatment_effect'].values.flatten()
print(f"\nP(treatment effect > 0): {np.mean(treatment_samples > 0):.1%}")

Strengths: Pythonic API, integrates with pandas/numpy/matplotlib, ArviZ for diagnostics.

Stan (via PyStan or CmdStanPy)

// checkout_model.stan
data {
  int<lower=0> N;
  array[N] int<lower=0,upper=1> converted;
  vector[N] treatment;
  vector[N] returning;
}

parameters {
  real intercept;
  real treatment_effect;
  real returning_effect;
}

model {
  // Priors
  intercept ~ normal(0, 2);
  treatment_effect ~ normal(0, 1);
  returning_effect ~ normal(0, 1);

  // Likelihood
  converted ~ bernoulli_logit(intercept +
                               treatment_effect * treatment +
                               returning_effect * returning);
}

generated quantities {
  real prob_treatment_positive = treatment_effect > 0 ? 1 : 0;
}
# Python interface (CmdStanPy)
from cmdstanpy import CmdStanModel

model = CmdStanModel(stan_file='checkout_model.stan')
data = {
    'N': n,
    'converted': converted.tolist(),
    'treatment': treatment.tolist(),
    'returning': returning.tolist()
}
fit = model.sample(data=data, chains=4, iter_sampling=2000, iter_warmup=1000)
print(fit.summary())

Strengths: Fastest sampler, best for complex custom models, excellent diagnostics.

brms (R)

library(brms)

# Data
set.seed(42)
n <- 5000
df <- data.frame(
  treatment = rbinom(n, 1, 0.5),
  returning = rbinom(n, 1, 0.4)
)
df$p <- plogis(-2.0 + 0.3 * df$treatment + 0.5 * df$returning)
df$converted <- rbinom(n, 1, df$p)

# Model (looks just like glm!)
fit <- brm(
  converted ~ treatment + returning,
  data = df,
  family = bernoulli(link = "logit"),
  prior = c(
    prior(normal(0, 2), class = "Intercept"),
    prior(normal(0, 1), class = "b")
  ),
  chains = 4,
  iter = 3000,
  warmup = 1000,
  seed = 42
)

# Results
summary(fit)

# Probability treatment is positive
posterior <- as_draws_df(fit)
cat("P(treatment > 0):", mean(posterior$b_treatment > 0), "\n")

Strengths: Formula syntax identical to glm(), easiest transition for R users, powerful built-in plot functions.


Choosing Your Tool

Decision Framework

Are you a Python user?
├── Yes
│   ├── Standard models (regression, A/B tests)?
│   │   └── PyMC (easiest Python option)
│   └── Custom or complex models?
│       ├── PyMC can express it → PyMC
│       └── Need maximum performance → Stan (via CmdStanPy)
└── No (R user)
    ├── Standard models?
    │   └── brms (formula syntax, fastest setup)
    └── Custom models brms cannot express?
        └── Stan (via RStan or CmdStanR)

When Each Tool Shines

PyMC is best when:

  • You work in Python and want to stay in one ecosystem
  • You need custom model components (mixture models, custom likelihoods)
  • You want seamless integration with pandas, scikit-learn, and matplotlib

Stan is best when:

  • You need maximum MCMC performance (large datasets, complex models)
  • You have a hierarchical model with many group levels
  • You need to deploy models in production with guaranteed performance
  • Your model requires custom MCMC tuning

brms is best when:

  • You are an R user familiar with formula syntax
  • You want the fastest path from idea to Bayesian results
  • Your model fits the regression framework (linear, logistic, multilevel)
  • You need beautiful default plots and summaries

Essential Diagnostics

Regardless of which tool you use, always check these diagnostics:

R-hat (Convergence)

# PyMC / ArviZ
import arviz as az
summary = az.summary(trace)
# R-hat should be < 1.01 for all parameters
print(summary[['r_hat']])

R-hat measures whether multiple MCMC chains agree. Values above 1.01 suggest the chains have not converged. Run longer or reparameterize the model.

Effective Sample Size (ESS)

# ESS should be > 400 for reliable inference
print(summary[['ess_bulk', 'ess_tail']])

ESS tells you how many independent samples your MCMC chain is worth. Low ESS means high autocorrelation -- run longer chains or reparameterize.

Divergences (Stan/PyMC)

# Check for divergent transitions
divergences = trace.sample_stats['diverging'].sum().values
print(f"Divergent transitions: {divergences}")
# Should be 0. If not, reparameterize or increase adapt_delta.

Divergences indicate the sampler struggled with the posterior geometry. They can cause biased results. Fix by reparameterizing or adjusting sampler settings.

Posterior Predictive Check

# Does the model reproduce the observed data pattern?
with checkout_model:
    ppc = pm.sample_posterior_predictive(trace)

# Compare predicted vs. observed conversion rates
pred_rate = ppc.posterior_predictive['converted'].mean(dim=['chain', 'draw']).values.mean()
obs_rate = converted.mean()
print(f"Observed conversion rate: {obs_rate:.1%}")
print(f"Predicted conversion rate: {pred_rate:.1%}")

Getting Started Checklist

  1. Install your chosen tool:

    • PyMC: pip install pymc
    • Stan: pip install cmdstanpy && install_cmdstan
    • brms: install.packages("brms") in R
  2. Start with a simple model: Bayesian A/B test (Beta-Binomial) or simple regression

  3. Check diagnostics: R-hat, ESS, divergences, posterior predictive checks

  4. Compare with frequentist results: For your first model, run both and verify they agree (with uninformative priors, they should)

  5. Gradually increase complexity: Add priors, then hierarchical structure, then custom components


Common Patterns in Product Analytics

A/B Test Analysis

# PyMC: Quick Bayesian A/B test
with pm.Model() as ab_model:
    p_control = pm.Beta('p_control', 1, 1)
    p_treatment = pm.Beta('p_treatment', 1, 1)
    lift = pm.Deterministic('lift', p_treatment - p_control)

    pm.Binomial('control_obs', n=10000, p=p_control, observed=1200)
    pm.Binomial('treatment_obs', n=10000, p=p_treatment, observed=1280)

    trace = pm.sample(2000, tune=1000)

print(f"P(treatment better): {(trace.posterior['lift'] > 0).mean().values:.1%}")

Hierarchical A/B Test by Segment

# brms: Experiment effect varying by country
fit <- brm(
  converted ~ treatment + (treatment | country),
  data = experiment_data,
  family = bernoulli(),
  prior = c(
    prior(normal(0, 1), class = "b"),
    prior(cauchy(0, 0.5), class = "sd")
  )
)


Key Takeaway

PyMC, Stan, and brms make Bayesian analysis accessible for product analysts. PyMC is the best Python option for custom models. Stan is the gold standard for complex hierarchical models. brms is the fastest on-ramp for R users. All three use the same underlying approach (MCMC sampling) and produce the same type of output (posterior samples). Choose based on your language preference and model complexity, then invest in learning diagnostics -- that is where most practical issues arise.


References

  1. https://www.pymc.io/welcome.html
  2. https://mc-stan.org/users/documentation/
  3. https://paul-buerkner.github.io/brms/

Frequently Asked Questions

Which tool should I start with?
If you use Python, start with PyMC. If you use R, start with brms. Both have excellent documentation and can handle 90% of product analytics use cases. Move to Stan when you need custom models, maximum performance, or when PyMC/brms cannot express your model.
How long does MCMC sampling take?
For simple models (A/B test, basic regression) with a few thousand observations: seconds. For hierarchical models with many groups: minutes. For complex custom models with millions of observations: minutes to hours. Stan's No-U-Turn Sampler (NUTS) is the gold standard for efficiency. PyMC also uses NUTS and is comparably fast for most models.
Do I need to understand MCMC to use these tools?
You need to understand the basics: MCMC draws samples from the posterior, you use those samples for inference, and you should check diagnostics (R-hat, effective sample size, divergences). You do not need to implement MCMC yourself. The tools handle the sampling; you focus on model specification and result interpretation.

Key Takeaway

PyMC, Stan, and brms make Bayesian analysis accessible for product analysts. PyMC is the best Python option for custom models. Stan is the gold standard for complex hierarchical models. brms is the fastest on-ramp for R users. All three use the same underlying approach (MCMC sampling) and produce the same type of output (posterior samples). Choose based on your language preference and model complexity, then invest in learning diagnostics -- that is where most practical issues arise.

Send to a friend

Share this with someone who loves clean statistical work.