Operadic consistency detects LLM reasoning failures by checking if a model's direct answer matches its step-by-step decomposed answer—a label-free signal that outperforms existing confidence measures across diverse models and datasets.
This paper introduces operadic consistency (OC), a method to detect when large language models fail at multi-step reasoning without needing correct answers. The key insight: a model's direct answer to a complex question should match the answer it gets by breaking down the question into steps and solving each one.