A technique that uses spatial information to guide the assignment of regions of a video frame to individual agents or subjects.