What Am I Missing? Question-Answering as Hidden State Probing

Chu Fei Luo, Samuel Dahan, Xiaodan Zhu|May 29, 2026arXiv

Key Takeaway

LLMs can detect their own reasoning failures through self-questioning, but this diagnostic ability doesn't translate to self-correction—a critical gap for building more reliable AI systems.

Summary

This paper investigates how LLMs can diagnose their own reasoning errors by asking questions during problem-solving. Researchers train a probe to detect when a model is uncertain by analyzing its hidden states before and after question generation, then use this signal to decide when to intervene. The key finding: models can identify when they're wrong, but struggle to actually fix their mistakes.

reasoning evaluation training

Key Terms

chain-of-thought hidden-states linear-probes self-consistency inference-time-compute