Vision Understanding

architecture

The ability of an AI model to analyze and interpret visual information in images, identifying objects, scenes, and the relationships among them.

Related Capabilities

Quality of vision, audio, and image understanding (distinct from modality support)