ORBIT: Scalable and Verifiable Data Generation for Search Agents on a Tight Budget

Nandan Thakur, Zijian Chen, Xueguang Ma, Jimmy Lin | April 1, 2026 | arXiv

Key Takeaway

You can build high-quality training data for search agents using synthetic generation and verification without expensive human annotation or API costs, enabling smaller models to compete with larger ones.

Summary

ORBIT is a dataset of 20,000 reasoning-heavy questions with verifiable answers, created cheaply without paid APIs. The authors built a four-stage pipeline (seed creation, question generation, self-verification, external verification) to generate training data for search agents—AI systems that combine language models with web search.
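The four stages named above can be read as a chain of filters: each seed yields at most one candidate question, and a candidate is kept only if it passes both verification steps. The following is a minimal sketch of that control flow, not the authors' implementation. The `Candidate` type, `run_pipeline`, and all `toy_*` functions are hypothetical names introduced for illustration; the summary does not say how each stage actually works, so the stage bodies here are trivial stand-ins.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional


@dataclass
class Candidate:
    """A question-answer pair tied to the seed it came from (hypothetical type)."""
    seed: str
    question: str
    answer: str


def run_pipeline(
    seeds: Iterable[str],
    generate: Callable[[str], Optional[Candidate]],
    self_verify: Callable[[Candidate], bool],
    external_verify: Callable[[Candidate], bool],
) -> List[Candidate]:
    """Keep only candidates that survive every stage, in seed order."""
    kept: List[Candidate] = []
    for seed in seeds:                      # stage 1 output: the seeds
        cand = generate(seed)               # stage 2: question generation
        if cand is None:
            continue                        # generator produced nothing usable
        if not self_verify(cand):           # stage 3: self-verification
            continue
        if not external_verify(cand):       # stage 4: external verification
            continue
        kept.append(cand)
    return kept


# Toy stand-ins (assumptions, NOT the paper's methods).
def toy_generate(seed: str) -> Optional[Candidate]:
    if seed == "gamma":                     # pretend generation failed here
        return None
    return Candidate(seed, f"Which item is labelled {seed}?", seed.upper())


def toy_self_verify(cand: Candidate) -> bool:
    # Stand-in check: well-formed question and a non-empty answer.
    return cand.question.endswith("?") and bool(cand.answer)


TRUSTED_ANSWERS = {"ALPHA"}                 # stand-in for an external source


def toy_external_verify(cand: Candidate) -> bool:
    return cand.answer in TRUSTED_ANSWERS


kept = run_pipeline(
    ["alpha", "beta", "gamma"],
    toy_generate,
    toy_self_verify,
    toy_external_verify,
)
print([c.seed for c in kept])
```

In this toy run, "gamma" is dropped at generation and "beta" at external verification, so only the "alpha" candidate remains. Passing the stages in as callables keeps the filtering order explicit and lets each stage be swapped out independently.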


Key Terms

search-augmented, synthetic-data, grpo, multi-step-reasoning, verifiable-answers