Regret Minimization with Adaptive Opponents in Repeated Games

Mingyang Liu, Asuman Ozdaglar, Tiancheng Yu, Kaiqing Zhang | June 4, 2026 | arXiv

Key Takeaway

RP-Regret is a game-theoretic regret metric for repeated games against adaptive opponents. Minimizing it can lead to better equilibria than standard regret minimization, and the paper provides algorithms with provable guarantees in the non-convex optimization setting.

Summary

This paper introduces Repeated Policy Regret (RP-Regret), a new way to measure how well a player performs in repeated games against opponents who adapt to past moves. Unlike standard regret metrics, which compare against alternatives while holding the opponent's observed actions fixed, RP-Regret accounts for what the player could have achieved had they responded differently to the game history.
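To make the contrast concrete, here is a minimal sketch. It does not implement the paper's RP-Regret, whose formal definition is not given in this summary. It only illustrates the general gap between standard (external) regret and a counterfactual, policy-style regret restricted to fixed-action comparators. The prisoner's dilemma payoffs, the tit-for-tat opponent, and all function names are assumptions chosen for illustration.

```python
# Row player's payoffs in a prisoner's dilemma (illustrative values, not from the paper).
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}
ACTIONS = ("C", "D")


def tit_for_tat(my_history):
    """Hypothetical adaptive opponent: cooperate first, then copy the player's last action."""
    return "C" if not my_history else my_history[-1]


def play(my_actions, opponent=tit_for_tat):
    """Return (total payoff, opponent's realized actions) for a sequence of player actions."""
    total, opp_actions = 0, []
    for t, a in enumerate(my_actions):
        b = opponent(my_actions[:t])  # opponent reacts to the player's past moves
        opp_actions.append(b)
        total += PAYOFF[(a, b)]
    return total, opp_actions


def external_regret(my_actions):
    """Best fixed action in hindsight, holding the opponent's observed actions fixed."""
    realized, opp_actions = play(my_actions)
    best_fixed = max(sum(PAYOFF[(a, b)] for b in opp_actions) for a in ACTIONS)
    return best_fixed - realized


def policy_regret(my_actions):
    """Best fixed action in hindsight, re-simulating how the opponent would have reacted."""
    realized, _ = play(my_actions)
    horizon = len(my_actions)
    best_fixed = max(play([a] * horizon)[0] for a in ACTIONS)
    return best_fixed - realized


always_defect = ["D"] * 10
print(external_regret(always_defect))  # 0
print(policy_regret(always_defect))    # 16
```

Under these assumptions, always defecting has zero external regret, because no fixed action does better against the opponent's observed sequence. Its counterfactual regret is 16, because cooperating throughout would have kept the tit-for-tat opponent cooperating.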

reasoning agents training

Key Terms

regret, repeated games, adaptive opponents, counterfactual reasoning, subgame-perfect equilibrium