Learning Coordinated Preference for Multi-Objective Multi-Agent Reinforcement Learning

Pengxin Wang, Lihao Guo, Yi Xie, Bo Liu, Siyang Cao et al. | June 12, 2026 | arXiv

Key Takeaway

Allowing different agents to optimize for different objective trade-offs, rather than forcing all agents to share the same preferences, improves both individual performance and team coordination in multi-objective cooperative settings.

Summary

This paper tackles multi-objective multi-agent reinforcement learning where teams must balance multiple conflicting goals while coordinating across agents with different roles. The authors propose PCMA, which learns different preference weights for each agent to enable better trade-offs between objectives and improve overall team performance.
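To make the idea of per-agent preference weights concrete, here is a minimal sketch. It is not the paper's PCMA algorithm: it assumes plain linear scalarization of a vector reward, fixes the weights by hand instead of learning them, and uses hypothetical names (`normalize`, `scalarize`, `team_return`) and made-up reward values. It only shows how a shared preference and per-agent preferences score the same vector rewards differently.

```python
from typing import Dict, List

def normalize(weights: List[float]) -> List[float]:
    """Rescale non-negative weights so they sum to 1."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    return [w / total for w in weights]

def scalarize(reward_vector: List[float], preference: List[float]) -> float:
    """Linear scalarization: weighted sum of the per-objective rewards."""
    if len(reward_vector) != len(preference):
        raise ValueError("reward and preference must have the same length")
    return sum(w * r for w, r in zip(preference, reward_vector))

def team_return(rewards: Dict[str, List[float]],
                preferences: Dict[str, List[float]]) -> float:
    """Sum of each agent's scalarized reward under its own preference."""
    return sum(scalarize(rewards[a], preferences[a]) for a in rewards)

# Hypothetical two-objective rewards: each agent is better at one objective.
rewards = {"agent_0": [4.0, 1.0], "agent_1": [1.0, 4.0]}

# Baseline: every agent shares the same preference vector.
shared = {a: normalize([1.0, 1.0]) for a in rewards}

# Alternative: each agent weights the objective that suits its role more heavily.
per_agent = {
    "agent_0": normalize([3.0, 1.0]),
    "agent_1": normalize([1.0, 3.0]),
}

shared_total = team_return(rewards, shared)
per_agent_total = team_return(rewards, per_agent)
print(shared_total, per_agent_total)
```

In this toy setup the per-agent weights give a higher scalarized team return because each agent's preference matches the objective it is good at. A learned method would have to discover such weights from interaction, which this sketch does not attempt.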

reasoning training

Key Terms

multi-objective-reinforcement-learning, multi-agent-coordination, preference-optimization, cooperative-game