For robotic tasks with visual ambiguity, storing rich multimodal memory with geometric grounding outperforms semantic compression—robots need fine-grained context, not just similarity-based retrieval, to handle non-Markovian decision problems.
Chameleon is a memory system for robots that handles situations where the same visual observation could mean different things depending on what happened before. Instead of storing compressed summaries like most systems, it preserves detailed geometric and visual information to disambiguate confusing situations, enabling robots to make better decisions during long, complex manipulation tasks.