Automated environment synthesis and trajectory generation can reduce the data requirements for tool-use agent training by 5x while improving downstream performance, making agentic RL more practical and scalable.
EnvFactory automates the creation of tool-use training environments and realistic multi-turn interaction trajectories for teaching language models to use tools effectively. It generates diverse, natural training data from verified executable environments, enabling more efficient agent training with fewer resources than existing approaches.