Self-Evaluation Is Already There: Eliciting Latent Judge Calibration in Base LLMs with Minimal Data

XiuYu Zhang, Yi Shan, Junfeng Fang, Zhenkai Liang | June 3, 2026 | arXiv

Key Takeaway

LLMs already possess an ability to predict how external judges will score their outputs, and this ability can be unlocked efficiently with minimal training data. This suggests that self-evaluation is a matter of revealing existing knowledge rather than teaching a new skill.

Summary

This paper shows that base language models already have a hidden ability to predict how external judges will score their outputs. The authors introduce SEE, a training method that surfaces this latent skill using just 160 examples, 31x fewer than standard approaches. SEE combines reinforcement learning with distillation to improve both answer quality and calibration accuracy.
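To make "calibration" concrete, here is a minimal sketch of one way to quantify how closely a model's self-predicted scores track a judge's scores. The summary does not state which metric the paper uses, so the mean absolute gap below, the function name `calibration_gap`, the 0-10 scale, and the sample scores are all assumptions made for illustration.

```python
from statistics import mean


def calibration_gap(self_scores, judge_scores):
    """Mean absolute difference between self-predicted and judge-assigned scores.

    Lower is better; 0 means the model predicted every judge score exactly.
    This is an illustrative metric, not necessarily the one used in the paper.
    """
    if len(self_scores) != len(judge_scores):
        raise ValueError("score lists must have the same length")
    if not self_scores:
        raise ValueError("need at least one scored output")
    return mean(abs(s - j) for s, j in zip(self_scores, judge_scores))


# Hypothetical scores on a 0-10 scale for four model outputs.
self_scores = [8, 6, 9, 3]
judge_scores = [7, 6, 5, 3]

gap = calibration_gap(self_scores, judge_scores)
print(gap)  # 1.25
```

Under this reading, a well-calibrated model is one whose gap approaches zero on held-out outputs, regardless of whether the outputs themselves score high or low.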

training, evaluation, alignment

Key Terms

calibration, reinforcement-learning, distillation, few-shot-learning, latent-ability