Self-Improving Language Models with Bidirectional Evolutionary Search

Guowei Xu, Zhenting Qi, Huangyuan Su, Weirui Ye, Himabindu Lakkaraju et al. | May 27, 2026 | arXiv

Key Takeaway

BES combines evolutionary operators with task decomposition to escape the limitations of autoregressive-only search, enabling language models to find better solutions during both training and inference on challenging reasoning tasks.

Summary

This paper introduces Bidirectional Evolutionary Search (BES), a method that improves language models by combining forward search (generating new solutions by mixing partial trajectories) with backward search (breaking tasks into verifiable subtasks).
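The two directions can be illustrated with a toy sketch. This is not the paper's algorithm: the integer-sequence task, the names `decompose`, `verify`, `crossover`, `mutate`, and `search`, and the use of elitist selection are all assumptions chosen for illustration. The backward direction is stood in for by splitting the task into independently checkable spans, and the forward direction by splicing a prefix of one candidate trajectory onto a suffix of another.

```python
import random

# Hypothetical toy task: recover a hidden sequence. Stands in for a reasoning
# problem whose intermediate steps can be checked one subtask at a time.
TARGET = [3, 1, 4, 1, 5, 9, 2, 6]

def decompose(length, chunk=2):
    """Backward direction (toy): split the task into contiguous subtasks."""
    return [(i, min(i + chunk, length)) for i in range(0, length, chunk)]

def verify(candidate, span):
    """Check a single subtask: does the candidate match the target on this span?"""
    lo, hi = span
    return candidate[lo:hi] == TARGET[lo:hi]

def score(candidate, subtasks):
    """Number of subtasks the candidate passes."""
    return sum(verify(candidate, s) for s in subtasks)

def crossover(a, b, rng):
    """Forward direction (toy): join a prefix of one trajectory to a suffix of another."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]

def mutate(candidate, rng):
    """Resample one position of the candidate at random."""
    child = list(candidate)
    child[rng.randrange(len(child))] = rng.randrange(10)
    return child

def search(pop_size=20, generations=50, seed=0):
    """Elitist evolutionary loop; returns the best candidate and per-generation best scores."""
    rng = random.Random(seed)
    subtasks = decompose(len(TARGET))
    population = [[rng.randrange(10) for _ in TARGET] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        population.sort(key=lambda c: score(c, subtasks), reverse=True)
        history.append(score(population[0], subtasks))
        elite = population[: pop_size // 2]  # survivors are kept unchanged
        children = [
            mutate(crossover(rng.choice(elite), rng.choice(elite), rng), rng)
            for _ in range(pop_size - len(elite))
        ]
        population = elite + children
    best = max(population, key=lambda c: score(c, subtasks))
    return best, history

if __name__ == "__main__":
    best, history = search()
    print(best, history[-1])
```

Because the elite half survives each generation unchanged, the best score in `history` never decreases. The per-subtask verifier is what makes splicing useful in this toy: a child can inherit spans that already pass from two different parents, which sampling complete candidates independently would not exploit.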

reasoning training

Key Terms

best-of-n-sampling, tree-search, autoregressive-expansion, sub-question-decomposition, evolutionary-search