Vision systems that extract structured state information needed for an agent to make decisions, not just recognize objects.