
StatsTest Blog

Experimental design, data analysis, and statistical tooling for modern teams. No hype, just the math.

StatsTest
Assumptions · Jan 26 · New

Normality Tests Are Overrated: Better Diagnostics and Thresholds

Why formal normality tests like Shapiro-Wilk are problematic and what to use instead. Learn practical thresholds for when non-normality actually matters.

Reporting · Jan 26 · New

The One-Slide Experiment Readout: Five Numbers That Matter

A template for presenting experiment results in one slide. Focus on the five numbers executives actually need to make a decision.

Multi-Group Comparisons · Jan 26 · New

One-Way ANOVA: Assumptions, Effect Sizes, and Proper Reporting

A practical guide to one-way ANOVA covering assumptions, diagnostics, effect size measures (eta-squared, omega-squared), and how to report results properly.

Effect Sizes · Jan 26 · New

P-Values vs. Confidence Intervals: How to Interpret Both for Decisions

Understand the relationship between p-values and confidence intervals, when they agree, when they seem to disagree, and how to use them together for better decisions.

Model Evaluation · Jan 26 · New

Paired Evaluation: McNemar's Test for Before/After Classification

When the same examples are evaluated by two models, use McNemar's test for proper inference. Learn why paired analysis is more powerful and how to implement it correctly.

Two-Group Comparisons · Jan 26 · New

Paired vs. Independent Data: A Diagnostic Checklist

How to determine whether your data is paired or independent, and why getting this wrong can dramatically affect your statistical power and validity.

Distributions · Jan 26 · New

Percentiles and Latency: Comparing P50, P95, P99 Correctly

How to properly compare percentile metrics like latency P50, P95, and P99 across groups. Learn about bootstrap inference, quantile regression, and the pitfalls of naive percentile comparisons.

Regression · Jan 26 · New

Poisson vs. Negative Binomial: Modeling Counts and Rates

A practical guide to choosing between Poisson and negative binomial regression for count data. Learn to detect overdispersion, handle excess zeros, and interpret rate ratios correctly.

Multi-Group Comparisons · Jan 26 · New

Post-Hoc Tests: Tukey, Dunnett, and Games-Howell Decision Tree

How to choose the right post-hoc test after ANOVA. Covers Tukey's HSD, Dunnett's test, Games-Howell, and Scheffé, with a clear decision tree for selection.

Effect Sizes · Jan 26 · New

Power Analysis Without Cargo Culting: Traps and Practical Heuristics

A practical guide to statistical power analysis that avoids common pitfalls. Learn when standard power calculations mislead, how to think about sample size decisions, and practical heuristics for real-world experimentation.

Effect Sizes · Jan 26 · New

Practical Significance Thresholds: Defining Business Impact Before You Analyze

Learn how to set meaningful thresholds for practical significance before running experiments. Covers the minimum detectable effect (MDE), business context, ROI-based thresholds, and the difference between statistical and practical significance.

Assumptions · Jan 26 · New

Pre-Analysis Checklist: Green, Yellow, and Red Flags for Analysts

A practical pre-flight checklist before running statistical analyses. Covers data quality, assumption checks, and common pitfalls that can derail your analysis.