A shared mathematical space in which different types of data (such as sounds and text descriptions) are represented as vectors, so that similar concepts lie close together and can be compared directly.
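As a minimal sketch of what "direct comparison" can look like, the snippet below scores audio clips against text descriptions using cosine similarity, one common closeness measure. The file names, captions, and three-dimensional vectors are hypothetical, hand-picked values chosen only for illustration; in practice the vectors would come from trained audio and text encoders, and the helper `best_caption` is an assumed name, not part of any library.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical embeddings: both modalities live in the same 3-d space.
audio_embeddings = {
    "dog_bark.wav": [0.9, 0.1, 0.0],
    "rainfall.wav": [0.0, 0.2, 0.9],
}
text_embeddings = {
    "a dog barking": [0.8, 0.2, 0.1],
    "rain falling on a roof": [0.1, 0.1, 0.95],
}


def best_caption(audio_name):
    """Pick the text description whose vector is closest to the audio vector."""
    audio_vec = audio_embeddings[audio_name]
    return max(
        text_embeddings,
        key=lambda text: cosine_similarity(audio_vec, text_embeddings[text]),
    )


if __name__ == "__main__":
    for name in audio_embeddings:
        print(name, "->", best_caption(name))
```

Because both modalities share one space, no translation step is needed between them: the same similarity function applies to any audio-text pair.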
Quality of vision, audio, and image understanding (distinct from modality support)