
StatsTest Blog

Experimental design, data analysis, and statistical tooling for modern teams. No hype, just the math.

Reporting · Jan 26 · New

Experiment Guardrails: Stopping Rules, Ramp Criteria, and Managing Risk

Protect your experiments and users with proper guardrails. Learn when to stop an experiment, how to safely ramp exposure, and which metrics should trigger automatic rollback.

statstest_flow Reporting Supporting

Regression · Jan 26 · New

Feature Scaling and Transforms: When Preprocessing Changes the Story

A practical guide to standardization, centering, and transformations in regression. Learn when scaling affects interpretation, when it's required, and how to interpret coefficients on transformed variables.

statstest_flow Regression Supporting

Two-Group Comparisons · Jan 26 · New

Handling Outliers: Trimmed Means, Winsorization, and Robust Methods

How to analyze data with outliers without throwing away information or letting extreme values dominate. Covers trimming, winsorization, robust estimators, and when each is appropriate.

statstest_flow Two-Group Comparisons Supporting

Survival Analysis · Jan 26 · New

Hazard Ratio Interpretation for Product Teams: When NOT to Use It

A practical guide to interpreting hazard ratios for non-statisticians. Learn what hazard ratios actually mean, common misinterpretations, when they're misleading, and better alternatives for communicating survival results.

statstest_flow Survival Analysis Supporting

Multi-Group Comparisons · Jan 26 · New

Heteroskedastic Groups: When Variances Differ and What to Do About It

How to handle multi-group comparisons when variances are unequal. Covers Welch's ANOVA, Games-Howell post-hoc tests, and why unequal variances matter more than non-normality.

statstest_flow Multi-Group Comparisons Supporting

Assumptions · Jan 26 · New

Independence: The Silent Killer of Statistical Validity

Independence is the most critical statistical assumption and the most commonly violated. Learn to detect non-independence arising from repeated measures, clustering, and time series, and what to do about it.

statstest_flow Assumptions Supporting

Model Evaluation · Jan 26 · New

Inter-Rater Reliability: Cohen's Kappa and Krippendorff's Alpha

How to measure agreement between human raters for AI evaluation. Learn when to use Cohen's Kappa vs. Krippendorff's Alpha, how to interpret values, and what to do when agreement is low.

statstest_flow Model Evaluation Supporting

Regression · Jan 26 · New

Interaction Terms: When Treatment Effects Vary by Segment

A practical guide to interaction effects in regression. Learn when to include interactions, how to interpret them correctly, and common pitfalls when testing whether treatment effects differ across segments.

statstest_flow Regression Supporting

Survival Analysis · Jan 26 · New

Kaplan-Meier Curves for Retention: How to Read and Explain Them

A practical guide to Kaplan-Meier survival curves for product retention analysis. Learn to create, interpret, and explain retention curves to stakeholders, including how to handle censoring and confidence intervals.

statstest_flow Survival Analysis Supporting

Multi-Group Comparisons · Jan 26 · New

Kruskal-Wallis Test: When It's Appropriate and Post-Hoc Strategy

Understanding the Kruskal-Wallis test for comparing multiple groups without normality assumptions. Covers what it actually tests, when to use it, and how to follow up with Dunn's test.

statstest_flow Multi-Group Comparisons Supporting

Regression · Jan 26 · New

Linear Regression Assumptions and Diagnostics in Practice

A practical guide to checking linear regression assumptions with diagnostic plots. Learn what violations actually look like, when they matter, and what to do when assumptions fail.

statstest_flow Regression Supporting

Survival Analysis · Jan 26 · New

Log-Rank Test: When It's Appropriate and Common Misuses

A practical guide to the log-rank test for comparing survival curves. Learn when it works, when it fails, and better alternatives when proportional hazards don't hold.

statstest_flow Survival Analysis Supporting