Why AI Evals Are Changing and What It Means for Open Source
Insight: The Eval Crisis and the New Frontier for Open Source
The latest digest of videos reveals a critical shift in the AI landscape: the old benchmarks are breaking. OpenAI’s Tejal Patwardhan explains that their frontier evals team must constantly invent new tests because models like o1 have outpaced existing measures. This isn’t just a …