Different Nash equilibrium solvers systematically select different equilibria depending on their algorithm design: regularized methods pick maximum-entropy solutions, while regret-averaging methods pick lower-entropy ones. The choice matters for robustness against imperfect opponents.
This paper investigates how algorithms for solving two-player zero-sum games each select a particular Nash equilibrium from the convex set of possible equilibria.
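To make the "convex set of equilibria" concrete, here is a minimal sketch on a toy game that is an assumption of this example, not taken from the paper: matching pennies in which the row player's first action is duplicated. The payoff matrix `A` and the helper names `worst_case_payoff`, `entropy`, and `mix` are all hypothetical. The sketch checks that two distinct row strategies and their midpoint all guarantee the game value, and that they differ in entropy.

```python
import math

# Hypothetical toy game (not from the paper): matching pennies in which the
# row player's first action is duplicated as a third action.
# Entries are payoffs to the row player; the column player gets the negative.
A = [
    [1.0, -1.0],
    [-1.0, 1.0],
    [1.0, -1.0],
]

GAME_VALUE = 0.0  # value of this toy game for the row player


def worst_case_payoff(x, payoff):
    """Payoff the row strategy x guarantees against a best-responding column player."""
    n_cols = len(payoff[0])
    column_payoffs = [
        sum(x[i] * payoff[i][j] for i in range(len(x))) for j in range(n_cols)
    ]
    return min(column_payoffs)


def entropy(x):
    """Shannon entropy in nats, treating 0 * log(0) as 0."""
    return -sum(p * math.log(p) for p in x if p > 0.0)


def mix(x, y, weight):
    """Convex combination weight * x + (1 - weight) * y."""
    return [weight * a + (1.0 - weight) * b for a, b in zip(x, y)]


# Two extreme equilibrium strategies for the row player, and their midpoint.
vertex_a = [0.5, 0.5, 0.0]
vertex_b = [0.0, 0.5, 0.5]
midpoint = mix(vertex_a, vertex_b, 0.5)

for name, x in [("vertex_a", vertex_a), ("vertex_b", vertex_b), ("midpoint", midpoint)]:
    is_equilibrium = abs(worst_case_payoff(x, A) - GAME_VALUE) < 1e-12
    print(f"{name}: strategy={x}, equilibrium={is_equilibrium}, entropy={entropy(x):.4f}")
```

In this toy game every row strategy of the form `(t, 0.5, 0.5 - t)` with `t` in `[0, 0.5]` is an equilibrium strategy. Under the summary's claim, a regularized solver would be expected to land near the higher-entropy midpoint, while a regret-averaging solver could settle on a lower-entropy point of the same set. The sketch only illustrates the set itself and runs no solver.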