Hybrid sensor fusion (IMUs + egocentric vision) enables robust, portable human motion capture in uncontrolled environments—critical for scaling robot learning with real-world human demonstrations.
RoSHI is a wearable system that combines body-worn IMU sensors with AR glasses to capture full-body human motion in real-world settings. By fusing inertial measurements with egocentric camera data, it produces accurate 3D pose estimates that remain reliable even when body parts are occluded from the camera or moving rapidly, making it practical for collecting robot training data from everyday human activities.
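The summary does not specify RoSHI's fusion algorithm, but the core idea of combining drift-prone, high-rate inertial signals with intermittent, drift-free vision estimates can be sketched with a simple complementary filter. Everything below (the function name, a 1D joint angle, the blend factor `alpha`) is an illustrative assumption, not the system's actual method:

```python
def fuse_orientation(gyro_rates, vision_angles, dt=0.01, alpha=0.98):
    """Illustrative complementary filter (not RoSHI's actual algorithm).

    gyro_rates:    angular rates from an IMU (rad/s) -- high rate, but
                   integration accumulates drift from sensor bias.
    vision_angles: angle estimates from an egocentric camera (rad), or
                   None on frames where the joint is occluded.
    alpha:         weight on the IMU prediction; (1 - alpha) pulls the
                   estimate toward the drift-free vision measurement.
    """
    angle = 0.0
    fused = []
    for rate, vis in zip(gyro_rates, vision_angles):
        # Predict: integrate the IMU angular rate (fast, but drifts).
        angle += rate * dt
        # Correct: blend toward the vision estimate when one is available;
        # during occlusion the filter coasts on the IMU alone.
        if vis is not None:
            angle = alpha * angle + (1 - alpha) * vis
        fused.append(angle)
    return fused
```

With a biased gyro, pure integration drifts without bound, while the occasional vision corrections keep the fused estimate near the true angle; this is the qualitative behavior the summary attributes to combining the two modalities.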