FUSE: Ensembling Verifiers with Zero Labeled Data

Joonhyuk Lee, Virginia Ma, Sarah Zhao, Yash Nair, Asher Spector et al. | April 20, 2026 | arXiv

Key Takeaway

You can build better verification systems by combining multiple imperfect judges without any ground-truth labels. FUSE shows that this works as well as supervised approaches on real benchmarks.

Summary

FUSE is a method for combining multiple imperfect AI judges (verifiers) to evaluate model outputs more accurately, without requiring any labeled correct answers. It uses spectral algorithms to ensemble the verifiers by controlling the dependencies among them, and it achieves results comparable to methods that do use labeled data.
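
To make the idea of label-free spectral ensembling concrete, here is a minimal sketch. It is not the FUSE algorithm, whose details are not given in this summary. It is a simpler, classic-style spectral weighting that assumes binary verdicts and verifiers whose errors are independent given the true answer (FUSE, by contrast, is described as explicitly controlling verifier dependence). The function names `spectral_weights` and `ensemble`, the synthetic accuracies, and the use of the full covariance matrix (diagonal included, a crude approximation) are all illustrative assumptions.

```python
import numpy as np

def spectral_weights(votes):
    """Estimate one weight per verifier from unlabeled verdicts.

    votes: (n_items, n_verifiers) array of +1 / -1 verdicts.
    Assumption: errors are conditionally independent, so the off-diagonal
    covariance is roughly rank one and its leading eigenvector tracks
    how reliable each verifier is.
    """
    cov = np.cov(votes, rowvar=False)
    _, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
    v = eigvecs[:, -1]                 # leading eigenvector
    # Resolve the sign ambiguity by assuming most verifiers beat chance.
    return -v if v.sum() < 0 else v

def ensemble(votes, weights):
    """Weighted vote over verifier verdicts; returns +1 / -1 per item."""
    return np.where(votes @ weights >= 0, 1, -1)

# Synthetic demo: hypothetical verifiers with made-up accuracies.
rng = np.random.default_rng(0)
n_items = 20000
accuracies = np.array([0.9, 0.8, 0.75, 0.7, 0.6])
truth = rng.choice([-1, 1], size=n_items)
correct = rng.random((n_items, accuracies.size)) < accuracies
votes = np.where(correct, truth[:, None], -truth[:, None])

weights = spectral_weights(votes)      # no labels used here
pred = ensemble(votes, weights)
# Labels are used only to score the demo, never to fit the weights.
ensemble_acc = (pred == truth).mean()
print("weights:", np.round(weights, 3))
print("ensemble accuracy:", ensemble_acc)
```

In this sketch the weights come entirely from how the verifiers agree and disagree with one another. When verifiers share failure modes, the independence assumption breaks down, which is the kind of dependence the summary says FUSE is designed to handle.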

evaluation training efficiency

Key Terms

verifier reward-model ensemble-methods spectral-methods test-time-scaling