Systematic data curation matters: the right mix of task sources and diversity in training data significantly improves how well agents generalize across different benchmarks.
This paper presents OpenThoughts-Agent, an open framework for creating training data for AI agents that handle diverse tasks. The authors ran more than 100 experiments to understand what makes good training data, then built a 100K-example dataset that improved agent performance by 3.9 percentage points over existing open models.