Safe continual reinforcement learning faces a fundamental trade-off: methods that enforce safety constraints when environments change often catastrophically forget previously learned behavior, while methods that preserve prior knowledge often violate safety constraints, a tension that existing approaches fail to fully resolve.
This paper studies how to safely train AI controllers that adapt to changing environments over time. The authors show that existing methods struggle to simultaneously prevent safety violations and avoid forgetting previously acquired knowledge when system dynamics shift unexpectedly.