Safe continual reinforcement learning faces a fundamental trade-off: methods that enforce safety constraints when environments change often catastrophically forget previously learned behavior, while methods that preserve prior knowledge often violate safety constraints, a tension that existing approaches fail to fully resolve.
This paper studies how to safely train AI controllers that adapt to changing environments over time. The authors show that existing methods struggle to simultaneously prevent safety violations and avoid forgetting previously acquired knowledge when system dynamics shift unexpectedly.