Systematic examination of individual model failures to identify patterns and root causes that aggregate metrics, such as overall accuracy, can hide.
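
One common way to carry this out is to group evaluation examples into slices and compare per-slice error rates against the overall score. The snippet below is a minimal sketch of that idea, not a prescribed method: the records, the slice names (`short_text`, `long_text`, `negation`), and the helper `error_rate_by_slice` are all hypothetical and exist only for illustration.

```python
from collections import defaultdict

# Hypothetical evaluation records: (slice name, true label, predicted label).
# The slices and labels are made up for illustration.
records = [
    ("short_text", "pos", "pos"),
    ("short_text", "neg", "neg"),
    ("short_text", "pos", "pos"),
    ("short_text", "neg", "neg"),
    ("long_text", "pos", "neg"),
    ("long_text", "neg", "neg"),
    ("negation", "neg", "pos"),
    ("negation", "neg", "pos"),
]


def error_rate_by_slice(records):
    """Return the fraction of misclassified examples in each slice."""
    totals = defaultdict(int)
    errors = defaultdict(int)
    for slice_name, y_true, y_pred in records:
        totals[slice_name] += 1
        if y_true != y_pred:
            errors[slice_name] += 1
    return {name: errors[name] / totals[name] for name in totals}


# Aggregate metric: a single number that hides where the failures are.
overall_accuracy = sum(t == p for _, t, p in records) / len(records)

# Per-slice view: shows which group of inputs drives the errors.
slice_errors = error_rate_by_slice(records)
worst_slice = max(slice_errors, key=slice_errors.get)

print(f"overall accuracy: {overall_accuracy:.3f}")
for name, rate in sorted(slice_errors.items(), key=lambda kv: -kv[1]):
    print(f"{name}: error rate {rate:.2f}")
print(f"worst slice: {worst_slice}")
```

In this toy data the overall accuracy looks moderate, yet every example in the `negation` slice is wrong. The next step would be to read those failing examples individually to find the root cause.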