Quantifying Faithful Confidence Expression in Large Reasoning Models

Areeb Gani, Asal Meskin, Gabrielle Kaili-May Liu, Arman Cohan | June 2, 2026 | arXiv

Key Takeaway

Large reasoning models frequently express confidence that does not match their actual uncertainty. This is a critical problem for deployment in high-stakes applications, and current evaluation methods fail to capture it.

Summary

This paper introduces a framework for measuring whether large reasoning models (LRMs) faithfully express their internal confidence in language. The researchers find that reasoning models often claim confidence they do not actually have, and that existing methods for measuring this mismatch do not work well on long reasoning traces.

evaluation, reasoning, alignment

Key Terms

calibration, reasoning-trace, token-probability, chain-of-thought, uncertainty-quantification