A robot policy can control its execution speed by scaling action magnitudes, so a single model can switch between fast and slow motions without retraining. This is useful for tasks that require both speed and precision.
TempoVLA enables robots to execute manipulation tasks at variable speeds by conditioning a Vision-Language-Action model on a speed parameter. The approach uses trajectory augmentation to create training data at different speeds and adds a conditioning mechanism to the policy, allowing a single model to handle both fast transit phases and slow, precise contact phases.
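
To make the trajectory-augmentation idea concrete, here is a minimal sketch of one way such training data could be generated. It is not TempoVLA's actual pipeline. It assumes a demonstration stored as a (T, D) array of end-effector positions, linear interpolation in time, and delta-position actions. The names `resample_trajectory` and `make_speed_conditioned_samples` are hypothetical and introduced only for illustration.

```python
import numpy as np


def resample_trajectory(positions, speed):
    """Linearly resample a (T, D) position array to play back `speed` times faster.

    Hypothetical helper: speed > 1 yields fewer, larger steps; speed < 1 yields
    more, smaller steps. The final pose is always kept.
    """
    positions = np.asarray(positions, dtype=float)
    if speed <= 0:
        raise ValueError("speed must be positive")
    last = positions.shape[0] - 1
    # Query times on the original timeline, advancing `speed` frames per step.
    times = np.append(np.arange(0.0, last, speed), float(last))
    original_times = np.arange(last + 1)
    # np.interp is 1-D, so interpolate each coordinate separately.
    columns = [
        np.interp(times, original_times, positions[:, d])
        for d in range(positions.shape[1])
    ]
    return np.stack(columns, axis=1)


def make_speed_conditioned_samples(positions, speeds):
    """Build one (speed, states, actions) training record per speed factor.

    Assumption: actions are deltas between consecutive resampled positions,
    and the speed value is stored as the conditioning input for the policy.
    """
    samples = []
    for speed in speeds:
        resampled = resample_trajectory(positions, speed)
        actions = np.diff(resampled, axis=0)  # per-step displacement
        samples.append(
            {"speed": speed, "states": resampled[:-1], "actions": actions}
        )
    return samples


# Toy demonstration: a straight-line motion over 5 frames in 2-D.
demo = np.stack([np.arange(5.0), 2.0 * np.arange(5.0)], axis=1)
samples = make_speed_conditioned_samples(demo, speeds=[0.5, 1.0, 2.0])
```

In this sketch the per-step displacement grows in proportion to the speed factor, which matches the idea of controlling speed by scaling action magnitudes. A real pipeline would also need to re-associate camera frames and language instructions with the resampled timesteps, which is omitted here.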