Structured outputs like bounding boxes or coordinates that localize errors, rather than natural language explanations.