You can generate a practically unlimited variety of reasoning training environments by recursively composing smaller, reusable building blocks. This cuts manual environment creation from 300 tasks to 50 while maintaining or improving model performance.
This paper introduces RACES, a framework that treats verifiable environments (structured tasks for training reasoning) as composable building blocks that can be automatically combined.
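To make the idea of composable, verifiable building blocks concrete, here is a minimal Python sketch. It is not the RACES API: the names `Env`, `compose`, `random_env`, `make_task`, and `verify`, as well as the three toy arithmetic blocks, are hypothetical and chosen only for illustration. The sketch assumes each block is a simple integer transformation with a known ground-truth answer, which is far simpler than the environments the paper describes.

```python
import random
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Env:
    """Hypothetical verifiable block: an integer transformation with a checkable answer."""
    name: str
    describe: Callable[[str], str]  # renders the step in words, given a description of its input
    apply: Callable[[int], int]     # ground-truth computation used for verification


def compose(outer: Env, inner: Env) -> Env:
    """Build a new environment that applies `inner` first, then `outer`."""
    return Env(
        name=f"{outer.name}({inner.name})",
        describe=lambda s: outer.describe(inner.describe(s)),
        apply=lambda v: outer.apply(inner.apply(v)),
    )


def random_env(blocks: list[Env], depth: int, rng: random.Random) -> Env:
    """Recursively compose randomly chosen blocks into a chain of `depth` steps."""
    env = rng.choice(blocks)
    if depth <= 1:
        return env
    return compose(env, random_env(blocks, depth - 1, rng))


def make_task(env: Env, rng: random.Random) -> tuple[str, int]:
    """Sample an input and return (prompt, expected answer)."""
    x = rng.randint(1, 9)
    return f"Compute {env.describe(str(x))}.", env.apply(x)


def verify(expected: int, response: str) -> bool:
    """Check a model's textual response against the ground truth."""
    try:
        return int(response.strip()) == expected
    except ValueError:
        return False


# Three toy building blocks (assumptions for illustration only).
DOUBLE = Env("double", lambda s: f"twice ({s})", lambda v: 2 * v)
INC = Env("inc", lambda s: f"one more than ({s})", lambda v: v + 1)
SQUARE = Env("square", lambda s: f"the square of ({s})", lambda v: v * v)

if __name__ == "__main__":
    rng = random.Random(0)
    env = random_env([DOUBLE, INC, SQUARE], depth=3, rng=rng)
    prompt, expected = make_task(env, rng)
    print(env.name, "|", prompt, "| expected:", expected)
```

Because `compose` returns another `Env`, composed environments can themselves be composed again, so a small hand-written set of blocks yields a number of distinct tasks that grows quickly with depth. Each composed task stays automatically checkable because its ground truth is derived from the blocks' own `apply` functions.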