In multi-agent competitive problems such as auctions, a solver-in-the-loop approach with pairwise payoff approximations lets you train agents that play near-equilibrium strategies at a fraction of the computational cost of exact game-theoretic solutions.
DNQ trains agents to bid competitively in multi-player auctions by cycling through four steps: collecting bidding trajectories, estimating payoffs with a shared neural network, computing equilibrium strategies with a solver, and training agents to imitate those equilibria.
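
To make the solver step concrete, here is a minimal sketch of how pairwise payoff estimates might feed an equilibrium computation. Everything in it is an assumption made for illustration, not DNQ's actual design: the function names are hypothetical, small payoff tables stand in for the shared network's estimates, each player's payoff is assumed to be the sum of its pairwise terms, and fictitious play is used as a simple stand-in solver. The resulting mixed strategies are the kind of target the agents would then be trained to imitate.

```python
# Illustrative sketch only. The names, the additive pairwise decomposition,
# and the fictitious-play solver are assumptions, not DNQ's actual components.

def expected_payoffs(pairwise, strategies, i):
    """Expected payoff of each action of player i, summed over pairwise terms."""
    n_actions = len(strategies[i])
    values = [0.0] * n_actions
    for j, sigma_j in enumerate(strategies):
        if j == i:
            continue
        table = pairwise[(i, j)]  # table[a_i][a_j]: estimated payoff to i against j
        for a in range(n_actions):
            values[a] += sum(table[a][b] * p for b, p in enumerate(sigma_j))
    return values


def normalize(counts):
    """Turn per-player action counts into mixed strategies."""
    return [[c / sum(row) for c in row] for row in counts]


def fictitious_play(pairwise, n_players, n_actions, iterations=5000):
    """Stand-in solver: every player repeatedly best-responds to the
    empirical action frequencies of the others."""
    counts = [[1.0] * n_actions for _ in range(n_players)]  # uniform pseudo-counts
    for _ in range(iterations):
        strategies = normalize(counts)
        for i in range(n_players):
            values = expected_payoffs(pairwise, strategies, i)
            best = max(range(n_actions), key=values.__getitem__)
            counts[i][best] += 1.0
    return normalize(counts)


def exploitability(pairwise, strategies):
    """Largest gain any single player can get by deviating unilaterally."""
    worst = 0.0
    for i, sigma_i in enumerate(strategies):
        values = expected_payoffs(pairwise, strategies, i)
        current = sum(p * v for p, v in zip(sigma_i, values))
        worst = max(worst, max(values) - current)
    return worst


# Toy stand-in for learned pairwise payoffs: two players, rock-paper-scissors.
RPS = [[0, -1, 1], [1, 0, -1], [-1, 1, 0]]
pairwise = {(0, 1): RPS, (1, 0): RPS}

equilibrium = fictitious_play(pairwise, n_players=2, n_actions=3)
print("approximate equilibrium:", equilibrium)
print("exploitability:", exploitability(pairwise, equilibrium))
```

In this sketch the exploitability value gives a rough check on how far the solver's output is from equilibrium under the estimated payoffs. The appeal of a pairwise decomposition, under the additive assumption above, is that the number of payoff tables grows with the number of player pairs rather than with the size of the full joint action space.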