Training AI systems through multi-agent self-play produces more robust and safer real-world behavior than traditional single-agent approaches, because each agent must learn to anticipate and coordinate with other agents rather than treating them as noise.
This paper shows that multi-agent reinforcement learning makes autonomous systems safer and more capable in real-world shared spaces.