Explanation consistency, whether a model applies the same reasoning to similar cases, is an important signal distinct from accuracy, and it can warn of model problems before they show up in standard performance metrics.
This paper introduces the C-Score, a new metric that measures whether medical AI models explain their decisions consistently across similar patients, rather than just whether explanations match radiologist annotations. The authors test six explanation methods across three neural networks and find that explanation consistency can predict model instability before accuracy drops.
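To make the idea concrete, here is a minimal sketch of how a consistency metric in this spirit could be computed. It is not the paper's C-Score definition: it assumes explanations are feature-attribution vectors, defines "similar patients" by nearest neighbors in input-feature space, and uses cosine similarity between explanations; the function name, pairing rule, and normalization are all illustrative choices.

```python
import numpy as np

def consistency_score(explanations: np.ndarray, features: np.ndarray, k: int = 5) -> float:
    """Toy consistency score (hypothetical, not the paper's C-Score):
    average cosine similarity between each case's explanation and the
    explanations of its k most similar cases in input-feature space."""
    # Unit-normalize explanation vectors so a dot product equals cosine similarity.
    expl = explanations / (np.linalg.norm(explanations, axis=1, keepdims=True) + 1e-12)
    # Euclidean distances between cases stand in for "similar patients" here.
    dists = np.linalg.norm(features[:, None, :] - features[None, :, :], axis=-1)
    np.fill_diagonal(dists, np.inf)  # never match a case with itself
    per_case = []
    for i in range(len(expl)):
        neighbors = np.argsort(dists[i])[:k]              # k nearest cases by input features
        per_case.append((expl[neighbors] @ expl[i]).mean())
    return float(np.mean(per_case))  # near 1.0 means similar cases receive similar explanations

# Example with random data: 100 cases, 20 input features, 20-dimensional attributions.
rng = np.random.default_rng(0)
features = rng.normal(size=(100, 20))
explanations = rng.normal(size=(100, 20))
print(consistency_score(explanations, features))
```

A score near 1.0 would indicate that the model explains similar patients in similar ways, while a falling score over time could flag the kind of instability the paper reports detecting before accuracy degrades.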