StatsTest Blog

Experimental design, data analysis, and statistical tooling for modern teams. No hype, just the math.

Robust Standard Errors: When to Use Them (and When to Use by Default)
Regression · Jan 26 · New

A practical guide to heteroscedasticity-robust and cluster-robust standard errors. Learn when conventional standard errors are wrong, which corrections to apply, and whether to use robust standard errors by default.

Robust Statistics Toolbox: Trimmed Means, Winsorization, and Rank Methods
Assumptions · Jan 26 · New

A practical guide to robust statistical methods that work without normality assumptions. Learn when to use trimmed means, Winsorization, M-estimators, and rank-based tests.

Sample Ratio Mismatch: Detection, Root Causes, and Solutions
A/B Testing · Jan 26 · New

How to detect sample ratio mismatch (SRM) in A/B tests, understand its common causes, and decide what to do when your experiment groups have unexpected sizes.

Sequential Testing: How to Peek at P-Values Without Inflating False Positives
A/B Testing · Jan 26 · New

Learn how sequential testing methods let you monitor A/B test results as data accumulates while maintaining valid statistical guarantees. Covers group sequential designs, always-valid inference, and practical implementation.

Statistically Significant but Meaningless: Practical Thresholds for Evals
Model Evaluation · Jan 26 · New

A 0.5% accuracy improvement with p<0.001 can be real but practically worthless. Learn how to distinguish statistically significant results from practically meaningful ones in model evaluation.

Time-to-Event Sample Size: Practical Approximations
Survival Analysis · Jan 26 · New

A practical guide to sample size calculations for survival studies. Learn how to power time-to-event analyses, what drives the required sample size, and which approximations work for retention experiments.

Data Transformations: When Log, Sqrt, and Box-Cox Help vs. Mislead
Assumptions · Jan 26 · New

A practical guide to data transformations in statistical analysis. Learn when transformations fix problems, when they create new ones, and how to interpret results correctly.

Testing Trends Across Ordered Groups: Jonckheere-Terpstra and Alternatives
Multi-Group Comparisons · Jan 26 · New

When your groups have a natural order (dose levels, experience tiers, usage intensity), standard ANOVA ignores this structure. Learn about trend tests that leverage ordering for more power.

Two-Way ANOVA vs. Regression: Understanding Interactions for Product Teams
Multi-Group Comparisons · Jan 26 · New

When to use two-way ANOVA versus regression for analyzing experiments with multiple factors. Covers interactions, main effects, and practical interpretation for product analytics.

Visual Diagnostics for Group Comparisons: The Plots That Matter
Multi-Group Comparisons · Jan 26 · New

How to visually check assumptions for ANOVA and other group comparisons. Covers boxplots, Q-Q plots, residual plots, and interaction plots with interpretation guidance.

Welch's T-Test vs. Student's T-Test: Why You Should Always Use Welch's
Two-Group Comparisons · Jan 26 · New

A definitive comparison of Welch's and Student's t-tests. Learn why the equal variance assumption fails in practice and why Welch's should be your default.

When Confidence Intervals and P-Values Seem to Disagree
Effect Sizes · Jan 26 · New

Understand why CIs and p-values sometimes appear to conflict and how to resolve these apparent contradictions. Learn common scenarios and the correct interpretation.