You can build better verification systems by combining multiple imperfect judges without any ground truth labels: FUSE shows that this can perform comparably to supervised approaches on real benchmarks.
FUSE is a method for combining multiple imperfect AI judges (verifiers) to evaluate model outputs more accurately without any labeled correct answers. It uses spectral algorithms to ensemble the verifiers while controlling the dependencies among them, achieving results comparable to methods that use labeled data.
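To make the idea of label-free spectral ensembling concrete, here is a minimal sketch. It is not FUSE itself: it is the classical spectral approach, which assumes the judges err independently given the true answer, whereas FUSE is described as explicitly controlling dependencies. Under that independence assumption, the off-diagonal covariance of the judges' votes is approximately rank one, and its leading eigenvector is roughly proportional to each judge's reliability. All names, the simulated accuracies, and the data are hypothetical; the hidden labels are used only to score the result at the end.

```python
import random

def simulate_votes(accuracies, n_items, rng):
    """Draw hidden +/-1 labels and conditionally independent judge votes."""
    labels = [rng.choice((-1, 1)) for _ in range(n_items)]
    votes = [[y if rng.random() < p else -y for y in labels] for p in accuracies]
    return labels, votes

def offdiag_covariance(votes):
    """Sample covariance between judges, with the diagonal left at zero."""
    m, n = len(votes), len(votes[0])
    means = [sum(v) / n for v in votes]
    cov = [[0.0] * m for _ in range(m)]
    for i in range(m):
        for j in range(m):
            if i != j:
                cross = sum(a * b for a, b in zip(votes[i], votes[j])) / n
                cov[i][j] = cross - means[i] * means[j]
    return cov

def leading_eigenvector(matrix, iters=200):
    """Power iteration starting from the all-ones vector."""
    m = len(matrix)
    vec = [1.0] * m
    for _ in range(iters):
        nxt = [sum(matrix[i][j] * vec[j] for j in range(m)) for i in range(m)]
        norm = sum(x * x for x in nxt) ** 0.5
        vec = [x / norm for x in nxt]
    return vec

def spectral_ensemble(votes):
    """Weight each judge by its eigenvector entry, then take a weighted vote."""
    weights = leading_eigenvector(offdiag_covariance(votes))
    # Assumption: most judges beat chance, so the weights should sum positive.
    if sum(weights) < 0:
        weights = [-w for w in weights]
    n = len(votes[0])
    preds = [
        1 if sum(w * v[k] for w, v in zip(weights, votes)) >= 0 else -1
        for k in range(n)
    ]
    return weights, preds

def accuracy(preds, labels):
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)

# Hypothetical judges of decreasing quality; no labels are used for fitting.
JUDGE_ACCURACIES = [0.85, 0.80, 0.75, 0.70, 0.65, 0.60]
rng = random.Random(0)
labels, votes = simulate_votes(JUDGE_ACCURACIES, 2000, rng)
weights, preds = spectral_ensemble(votes)

print("weights:", [round(w, 3) for w in weights])
print("ensemble accuracy:", accuracy(preds, labels))
print("best single judge:", max(accuracy(v, labels) for v in votes))
```

In this simulation the learned weights should track the judges' true quality even though no labels were used to fit them. When judges share failure modes, the rank-one assumption breaks down and this simple estimator can overweight correlated judges, which is the kind of dependence a method like FUSE is meant to handle.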