Library

StatsTest Blog

Experimental design, data analysis, and statistical tooling for modern teams. No hype, just the math.

Model EvaluationJan 26New

Model Evaluation & Human Ratings Significance for AI Products

Statistical rigor for ML/AI evaluation: comparing model performance, analyzing human ratings, detecting drift, and making defensible decisions. A comprehensive guide for AI practitioners and product teams.

statstest_flow Model Evaluation Pillar