By explicitly modeling object state changes as a learnable manifold, WorldString provides a unified way to represent how objects respond to actions, bridging perception and control for physical world models.
WorldString is a neural architecture that learns, from point clouds or video, how real-world objects change state over time. It builds a digital twin of each object that captures its actionable properties, that is, how the object responds to actions. This twin serves as a building block for world models that can predict the physical world and interact with it.
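The description above does not specify WorldString's layers or training objective, so the following is only a toy sketch of the general idea: encode a point cloud into a latent state, then predict the next latent state given an action. Every name, size, and design choice here (`init_params`, `encode`, `step`, the max-pooled encoder, the residual update, the random untrained weights) is a hypothetical stand-in for illustration, not the project's API or method.

```python
import numpy as np


def init_params(point_dim=3, hidden=32, latent=8, action_dim=4, seed=0):
    """Random, untrained weights for a toy encoder and transition model.

    All sizes are illustrative assumptions, not values from WorldString.
    """
    rng = np.random.default_rng(seed)
    return {
        "W_pt": rng.normal(0.0, 0.1, (point_dim, hidden)),
        "W_z": rng.normal(0.0, 0.1, (hidden, latent)),
        "W_dyn": rng.normal(0.0, 0.1, (latent + action_dim, latent)),
    }


def encode(points, params):
    """Map an (N, point_dim) point cloud to a latent state vector.

    Per-point features followed by max pooling make the result
    independent of the order in which points are listed.
    """
    feats = np.maximum(points @ params["W_pt"], 0.0)  # (N, hidden), ReLU
    pooled = feats.max(axis=0)                        # (hidden,)
    return np.tanh(pooled @ params["W_z"])            # (latent,)


def step(z, action, params):
    """Predict the next latent state from the current state and an action.

    The update is residual: the model outputs a bounded change to z.
    """
    delta = np.tanh(np.concatenate([z, action]) @ params["W_dyn"])
    return z + delta


# Usage: encode one synthetic cloud and roll the state forward by one action.
params = init_params()
cloud = np.random.default_rng(1).normal(size=(128, 3))
z0 = encode(cloud, params)
z1 = step(z0, np.array([1.0, 0.0, 0.0, 0.0]), params)
```

In a trained system the weights would be fit so that `step(encode(cloud_t), action)` matches `encode(cloud_t_plus_1)`; here they are random, so the sketch shows only the data flow between perception (the encoder) and control (the action-conditioned transition).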