Experimental design, data analysis, and statistical tooling for modern teams. No hype, just the math.
How to measure agreement between human raters for AI evaluation. Learn when to use Cohen's Kappa vs. Krippendorff's Alpha, how to interpret values, and what to do when agreement is low.
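As a rough illustration of the kind of calculation involved, here is a minimal sketch of Cohen's Kappa for two raters labeling the same items. The function name `cohens_kappa` and the sample labels are made up for this example. It assumes nominal categories and exactly two raters with no missing ratings; Krippendorff's Alpha, which also handles more raters and missing data, is not shown.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's Kappa for two raters over the same items (nominal labels)."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)

    # Observed agreement: share of items where the two labels match.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: for each category, the product of the two raters'
    # marginal proportions, summed over categories.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)

    if expected == 1.0:
        # Both raters used a single identical label, so kappa is undefined.
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# Hypothetical labels from two human raters judging eight model outputs.
rater_a = ["good", "good", "bad", "bad", "good", "bad", "good", "good"]
rater_b = ["good", "bad", "bad", "bad", "good", "bad", "good", "good"]
print(cohens_kappa(rater_a, rater_b))  # 0.75
```

Here the raters agree on 7 of 8 items (0.875), but 0.5 of that agreement would be expected by chance given how often each rater uses each label, so Kappa comes out lower than raw agreement.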