AI • Oct 6, 2025

Measuring Agents With Interactive Evaluations

Uploaded: 2025-10-06

OpenAI DevDay 2025 Conference Collection

Agents explore, plan, and reliably execute across diverse, long-horizon tasks, and static benchmarks can’t measure these capabilities. Hear from Greg Kamradt, President of the ARC Prize Foundation, on why measuring agentic performance requires interactive evaluations.

#Agents