Extending image segmentation to video by identifying and tracking objects across frames.
Quality of vision, audio, and image understanding (distinct from modality support).