Practical Bayes: Using PyMC, Stan, and brms for Real Analysis
Hands-on guide to Bayesian tools. Compare PyMC, Stan, and brms with real examples. Learn which tool fits your workflow and how to get started fast.
Quick Hits
- PyMC (Python): Best for custom models, deep integration with the Python data science stack
- Stan (multi-language): Best for complex hierarchical models and production-grade MCMC performance
- brms (R): Best for analysts familiar with R's formula syntax -- the fastest path from lm() to Bayesian
- All three tools use MCMC sampling -- the concepts transfer across tools
- Start with brms or PyMC for typical product analytics; move to Stan for custom or performance-critical models
TL;DR
Three tools dominate modern Bayesian analysis: PyMC (Python), Stan (multi-language), and brms (R). This guide shows you how to fit the same model in all three, compares their strengths and weaknesses, and helps you choose the right tool for your workflow. All three are production-ready and can handle typical product analytics problems.
Tool Comparison at a Glance
| Feature | PyMC | Stan | brms |
|---|---|---|---|
| Language | Python | C++ (R/Python/etc. interfaces) | R |
| Learning curve | Moderate | Steep | Gentle |
| Model flexibility | High | Very high | Moderate (formula-based) |
| MCMC sampler | NUTS (with optional JAX/NumPyro backends) | NUTS (reference implementation) | NUTS (via Stan backend) |
| Speed | Fast | Fastest | Fast (via Stan) |
| Diagnostics | Built-in (ArviZ) | Built-in + ShinyStan | Built-in |
| Custom models | Full control | Full control | Limited to formula syntax |
| Community | Large, growing | Large, established | Large, R-focused |
| Best for | Python data scientists | Complex/custom models | R analysts transitioning to Bayes |
The Same Model in Three Tools
The Problem
You want to estimate the effect of a new checkout flow on conversion rate, controlling for user segment (new vs. returning).
Model: logistic regression with a treatment indicator and a user-segment indicator -- converted_i ~ Bernoulli(p_i), with logit(p_i) = intercept + treatment_effect × treatment_i + returning_effect × returning_i.
PyMC (Python)
import pymc as pm
import numpy as np
import arviz as az
# Simulate data
np.random.seed(42)
n = 5000
treatment = np.random.binomial(1, 0.5, n)
returning = np.random.binomial(1, 0.4, n)
logit_p = -2.0 + 0.3 * treatment + 0.5 * returning
p = 1 / (1 + np.exp(-logit_p))
converted = np.random.binomial(1, p, n)
# Model
with pm.Model() as checkout_model:
    # Priors
    beta_0 = pm.Normal('intercept', mu=0, sigma=2)
    beta_treatment = pm.Normal('treatment_effect', mu=0, sigma=1)
    beta_returning = pm.Normal('returning_effect', mu=0, sigma=1)
    # Linear predictor (renamed to avoid shadowing the simulation variable above)
    eta = beta_0 + beta_treatment * treatment + beta_returning * returning
    # Likelihood
    y = pm.Bernoulli('converted', logit_p=eta, observed=converted)
    # Sample
    trace = pm.sample(2000, tune=1000, cores=4, random_seed=42)
# Results
summary = az.summary(trace, var_names=['intercept', 'treatment_effect', 'returning_effect'])
print(summary)
# Probability treatment is positive
treatment_samples = trace.posterior['treatment_effect'].values.flatten()
print(f"\nP(treatment effect > 0): {np.mean(treatment_samples > 0):.1%}")
Strengths: Pythonic API, integrates with pandas/numpy/matplotlib, ArviZ for diagnostics.
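For instance, ArviZ produces trace and posterior plots in one call each -- a minimal sketch, assuming the trace object fitted above and a working matplotlib backend:
# Visual diagnostics: chain mixing over iterations and the marginal posterior
az.plot_trace(trace, var_names=['treatment_effect'])
az.plot_posterior(trace, var_names=['treatment_effect'], ref_val=0)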
Stan (via PyStan or CmdStanPy)
// checkout_model.stan
data {
  int<lower=0> N;
  array[N] int<lower=0,upper=1> converted;
  vector[N] treatment;
  vector[N] returning;
}
parameters {
  real intercept;
  real treatment_effect;
  real returning_effect;
}
model {
  // Priors
  intercept ~ normal(0, 2);
  treatment_effect ~ normal(0, 1);
  returning_effect ~ normal(0, 1);
  // Likelihood
  converted ~ bernoulli_logit(intercept +
                              treatment_effect * treatment +
                              returning_effect * returning);
}
generated quantities {
  real prob_treatment_positive = treatment_effect > 0 ? 1 : 0;
}
# Python interface (CmdStanPy)
from cmdstanpy import CmdStanModel
model = CmdStanModel(stan_file='checkout_model.stan')
data = {
    'N': n,
    'converted': converted.tolist(),
    'treatment': treatment.tolist(),
    'returning': returning.tolist()
}
fit = model.sample(data=data, chains=4, iter_sampling=2000, iter_warmup=1000)
print(fit.summary())
Strengths: Fastest sampler, best for complex custom models, excellent diagnostics.
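Getting results out of the fit object is similarly direct -- a sketch using CmdStanPy's diagnose() and draws_pd() methods on the fit returned above:
# Text report covering divergences, E-BFMI, effective sample size, and R-hat
print(fit.diagnose())
# Posterior draws as a pandas DataFrame, one column per parameter
draws = fit.draws_pd()
print("P(treatment effect > 0):", (draws['treatment_effect'] > 0).mean())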
brms (R)
library(brms)
# Data
set.seed(42)
n <- 5000
df <- data.frame(
  treatment = rbinom(n, 1, 0.5),
  returning = rbinom(n, 1, 0.4)
)
df$p <- plogis(-2.0 + 0.3 * df$treatment + 0.5 * df$returning)
df$converted <- rbinom(n, 1, df$p)
# Model (looks just like glm!)
fit <- brm(
  converted ~ treatment + returning,
  data = df,
  family = bernoulli(link = "logit"),
  prior = c(
    prior(normal(0, 2), class = "Intercept"),
    prior(normal(0, 1), class = "b")
  ),
  chains = 4,
  iter = 3000,
  warmup = 1000,
  seed = 42
)
# Results
summary(fit)
# Probability treatment is positive
posterior <- as_draws_df(fit)
cat("P(treatment > 0):", mean(posterior$b_treatment > 0), "\n")
Strengths: Formula syntax identical to glm(), easiest transition for R users, powerful built-in plot functions.
Choosing Your Tool
Decision Framework
Are you a Python user?
├── Yes
│ ├── Standard models (regression, A/B tests)?
│ │ └── PyMC (easiest Python option)
│ └── Custom or complex models?
│ ├── PyMC can express it → PyMC
│ └── Need maximum performance → Stan (via CmdStanPy)
└── No (R user)
├── Standard models?
│ └── brms (formula syntax, fastest setup)
└── Custom models brms cannot express?
└── Stan (via RStan or CmdStanR)
When Each Tool Shines
PyMC is best when:
- You work in Python and want to stay in one ecosystem
- You need custom model components (mixture models, custom likelihoods)
- You want seamless integration with pandas, scikit-learn, and matplotlib
Stan is best when:
- You need maximum MCMC performance (large datasets, complex models)
- You have a hierarchical model with many group levels
- You need to deploy models in production with guaranteed performance
- Your model requires custom MCMC tuning
brms is best when:
- You are an R user familiar with formula syntax
- You want the fastest path from idea to Bayesian results
- Your model fits the regression framework (linear, logistic, multilevel)
- You need beautiful default plots and summaries
Essential Diagnostics
Regardless of which tool you use, always check these diagnostics:
R-hat (Convergence)
# PyMC / ArviZ
import arviz as az
summary = az.summary(trace)
# R-hat should be < 1.01 for all parameters
print(summary[['r_hat']])
R-hat measures whether multiple MCMC chains agree. Values above 1.01 suggest the chains have not converged. Run longer or reparameterize the model.
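A quick programmatic check -- a minimal sketch, assuming the trace from the PyMC example above:
# Flag any parameter whose R-hat exceeds 1.01
rhat = az.rhat(trace)
flagged = {name: float(da.max()) for name, da in rhat.data_vars.items() if float(da.max()) > 1.01}
print("Parameters with R-hat > 1.01:", flagged or "none")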
Effective Sample Size (ESS)
# ESS should be > 400 for reliable inference
print(summary[['ess_bulk', 'ess_tail']])
ESS tells you how many independent samples your MCMC chain is worth. Low ESS means high autocorrelation -- run longer chains or reparameterize.
Divergences (Stan/PyMC)
# Check for divergent transitions
divergences = trace.sample_stats['diverging'].sum().values
print(f"Divergent transitions: {divergences}")
# Should be 0. If not, reparameterize or raise target_accept (called adapt_delta in Stan).
Divergences indicate the sampler struggled with the posterior geometry. They can cause biased results. Fix by reparameterizing or adjusting sampler settings.
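In PyMC, the usual first step is to raise target_accept (the default is around 0.8) -- a sketch reusing the checkout model from above:
# Smaller, more careful steps: slower sampling, but far fewer divergences
with checkout_model:
    trace = pm.sample(2000, tune=2000, target_accept=0.95, random_seed=42)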
Posterior Predictive Check
# Does the model reproduce the observed data pattern?
with checkout_model:
    ppc = pm.sample_posterior_predictive(trace)
# Compare predicted vs. observed conversion rates
pred_rate = ppc.posterior_predictive['converted'].mean(dim=['chain', 'draw']).values.mean()
obs_rate = converted.mean()
print(f"Observed conversion rate: {obs_rate:.1%}")
print(f"Predicted conversion rate: {pred_rate:.1%}")
Getting Started Checklist
- Install your chosen tool:
  - PyMC: pip install pymc
  - Stan: pip install cmdstanpy && install_cmdstan
  - brms: install.packages("brms") in R
- Start with a simple model: Bayesian A/B test (Beta-Binomial) or simple regression
- Check diagnostics: R-hat, ESS, divergences, posterior predictive checks
- Compare with frequentist results: For your first model, run both and verify they agree (with uninformative priors, they should) -- see the sketch after this checklist
- Gradually increase complexity: Add priors, then hierarchical structure, then custom components
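For the frequentist comparison, a minimal sketch in Python, assuming statsmodels is installed and reusing the simulated checkout data from the PyMC example above:
import statsmodels.api as sm
# Maximum-likelihood logistic regression on the same data
X = sm.add_constant(np.column_stack([treatment, returning]))
mle = sm.Logit(converted, X).fit(disp=0)
# With weak priors, these estimates should sit close to the Bayesian posterior means
print(mle.params)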
Common Patterns in Product Analytics
A/B Test Analysis
# PyMC: Quick Bayesian A/B test
with pm.Model() as ab_model:
    p_control = pm.Beta('p_control', 1, 1)
    p_treatment = pm.Beta('p_treatment', 1, 1)
    lift = pm.Deterministic('lift', p_treatment - p_control)
    pm.Binomial('control_obs', n=10000, p=p_control, observed=1200)
    pm.Binomial('treatment_obs', n=10000, p=p_treatment, observed=1280)
    trace = pm.sample(2000, tune=1000)
p_better = float((trace.posterior['lift'] > 0).mean())
print(f"P(treatment better): {p_better:.1%}")
Hierarchical A/B Test by Segment
# brms: Experiment effect varying by country
fit <- brm(
  converted ~ treatment + (treatment | country),
  data = experiment_data,
  family = bernoulli(),
  prior = c(
    prior(normal(0, 1), class = "b"),
    prior(cauchy(0, 0.5), class = "sd")
  )
)
Related Methods
- Bayesian Methods Overview (Pillar) - When and why to go Bayesian
- Bayesian A/B Testing - Testing with posteriors
- Bayesian Regression - Regression with priors
- Hierarchical Models - Multi-level modeling
- Prior Selection - Choosing priors for your models
Key Takeaway
PyMC, Stan, and brms make Bayesian analysis accessible for product analysts. PyMC is the best Python option for custom models. Stan is the gold standard for complex hierarchical models. brms is the fastest on-ramp for R users. All three use the same underlying approach (MCMC sampling) and produce the same type of output (posterior samples). Choose based on your language preference and model complexity, then invest in learning diagnostics -- that is where most practical issues arise.
References
- https://www.pymc.io/welcome.html
- https://mc-stan.org/users/documentation/
- https://paul-buerkner.github.io/brms/