StatsTest Blog
Experimental design, data analysis, and statistical tooling for modern teams. No hype, just the math.

Paired Evaluation: McNemar's Test for Before/After Classification
When the same examples are evaluated by two models, use McNemar's test for proper inference. Learn why paired analysis is more powerful and how to implement it correctly.

Paired vs. Independent Data: A Diagnostic Checklist
How to determine whether your data is paired or independent, and why getting this wrong can dramatically affect your statistical power and validity.

Percentiles and Latency: Comparing P50, P95, P99 Correctly
How to properly compare percentile metrics like latency P50, P95, and P99 across groups. Learn about bootstrap inference, quantile regression, and the pitfalls of naive percentile comparisons.

Poisson vs. Negative Binomial: Modeling Counts and Rates
A practical guide to choosing between Poisson and negative binomial regression for count data. Learn to detect overdispersion, handle excess zeros, and interpret rate ratios correctly.

Post-Hoc Tests: Tukey, Dunnett, and Games-Howell Decision Tree
How to choose the right post-hoc test after ANOVA. Covers Tukey's HSD, Dunnett's test, Games-Howell, and Scheffé, with a clear decision tree for selection.

Power Analysis Without Cargo Culting: Traps and Practical Heuristics
A practical guide to statistical power analysis that avoids common pitfalls. Learn when standard power calculations mislead, how to think about sample size decisions, and practical heuristics for real-world experimentation.

Practical Significance Thresholds: Defining Business Impact Before You Analyze
Learn how to set meaningful thresholds for practical significance before running experiments. Covers minimum detectable effect (MDE), business context, ROI-based thresholds, and the difference between statistical and practical significance.

Pre-Analysis Checklist: Green, Yellow, and Red Flags for Analysts
A practical pre-flight checklist before running statistical analyses. Covers data quality, assumption checks, and common pitfalls that can derail your analysis.

Pre-Registration Lite for Product Experiments: A Pragmatic Workflow
A lightweight pre-registration process that works in fast-moving product teams. Document your analysis plan in 15 minutes and build credibility through transparency.

Ratio Metrics (CTR, Conversion): Why They're Tricky and Stable Alternatives
Why ratio metrics like CTR and conversion rates require special statistical treatment. Learn about variance estimation, the delta method, and when to use alternative approaches.

Regression vs. t-Test vs. ANOVA: The Unifying View (and When the Simpler Tool Suffices)
Understand how t-tests, ANOVA, and regression are all special cases of the same underlying linear model. Learn when to use the simpler approach and when regression's flexibility is worth it.
