The ability of an AI model to analyze and interpret visual information from images, identifying objects, scenes, and their relationships.
Quality of vision, audio, and image understanding (distinct from modality support, which only indicates whether an input type is accepted).