Training a policy by having humans intervene to correct the robot while it acts, then updating the policy on those corrections.
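
The loop described above can be sketched in a few lines. This is only a toy illustration under stated assumptions, not a reference implementation: the state is a single number, the policy is a single linear gain, the "human" is a simulated expert, and the intervention rule is a fixed threshold. All names here (`expert_action`, `human_intervenes`, `rollout`, `fit_policy`, `train`) are hypothetical.

```python
import numpy as np


def expert_action(state):
    # Assumption: a simulated stand-in for the human, steering the state toward 0.
    return -0.8 * state


def human_intervenes(state, threshold=0.5):
    # Assumption: the human takes over whenever the state drifts past a threshold.
    return abs(state) > threshold


def rollout(weight, rng, steps=50):
    """Run one episode and return the (state, corrective action) pairs
    recorded while the human was in control."""
    corrections = []
    state = rng.uniform(-1.0, 1.0)
    for _ in range(steps):
        if human_intervenes(state):
            action = expert_action(state)         # human overrides the robot
            corrections.append((state, action))   # keep the correction as a label
        else:
            action = weight * state               # robot acts on its own policy
        state = state + action + rng.normal(0.0, 0.05)
    return corrections


def fit_policy(data):
    # Least-squares fit of the one-parameter model: action = weight * state.
    states = np.array([s for s, _ in data])
    actions = np.array([a for _, a in data])
    return float(states @ actions / (states @ states))


def train(rounds=5, seed=0):
    rng = np.random.default_rng(seed)
    weight = 0.5   # deliberately poor initial policy: it pushes the state away from 0
    dataset = []
    for _ in range(rounds):
        dataset.extend(rollout(weight, rng))  # aggregate corrections across rounds
        if dataset:
            weight = fit_policy(dataset)      # retrain on all corrections so far
    return weight, len(dataset)


if __name__ == "__main__":
    weight, n = train()
    print(f"learned weight: {weight:.3f} from {n} corrections")
```

In this sketch the policy is trained only on states where the human stepped in, so the learned gain ends up close to the simulated expert's gain of about -0.8. A real system would replace the threshold rule with an actual operator and the linear fit with whatever policy class the robot uses.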