You can reverse-engineer an agent's decision logic from its behavior by combining passive observation with strategic experimentation. The technique applies to both policy interpretability and opponent modeling in competitive settings.
RevengeBench is a benchmark for reconstructing hidden decision-making code from an agent's behavior in games. Researchers observe a hidden policy as it plays and can design custom opponents to probe its behavior; they then submit executable code that mimics it.
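The observe, probe, and submit loop can be illustrated with a minimal sketch. This is not RevengeBench's actual interface, which is not described here: the game (an iterated prisoner's dilemma), the stand-in `hidden_policy`, the probe opponents, and the `agreement` score are all hypothetical, chosen only to show why designed opponents reveal more than passive observation.

```python
from typing import Callable, List, Tuple

Action = str                              # "C" (cooperate) or "D" (defect)
History = List[Tuple[Action, Action]]     # (policy_move, opponent_move) per round
Policy = Callable[[History], Action]


def hidden_policy(history: History) -> Action:
    """Hypothetical black box (tit-for-tat). In a real setting this code is unseen."""
    return "C" if not history else history[-1][1]


# Hypothetical probe opponents, designed to elicit different responses.
def always_cooperate(history: History) -> Action:
    return "C"


def always_defect(history: History) -> Action:
    return "D"


def alternate(history: History) -> Action:
    return "C" if len(history) % 2 == 0 else "D"


def play(policy: Policy, opponent: Policy, rounds: int = 10) -> History:
    """Record a trace of the policy playing against one opponent."""
    history: History = []
    for _ in range(rounds):
        # The opponent sees the same history from its own perspective.
        flipped = [(o, p) for p, o in history]
        history.append((policy(history), opponent(flipped)))
    return history


def agreement(candidate: Policy, traces: List[History]) -> float:
    """Fraction of recorded decisions that the candidate reproduces."""
    hits = total = 0
    for trace in traces:
        for t, (observed_move, _) in enumerate(trace):
            hits += candidate(trace[:t]) == observed_move
            total += 1
    return hits / total


# Two candidate reconstructions one might submit.
def candidate_always_cooperate(history: History) -> Action:
    return "C"


def candidate_tit_for_tat(history: History) -> Action:
    return "C" if not history else history[-1][1]


# Passive observation against a single friendly opponent.
passive_traces = [play(hidden_policy, always_cooperate)]
# Strategic experimentation with several designed opponents.
probe_traces = [play(hidden_policy, p)
                for p in (always_cooperate, always_defect, alternate)]

print("passive, always-cooperate guess:", agreement(candidate_always_cooperate, passive_traces))
print("probed,  always-cooperate guess:", agreement(candidate_always_cooperate, probe_traces))
print("probed,  tit-for-tat guess:     ", agreement(candidate_tit_for_tat, probe_traces))
```

In this toy setup, the wrong candidate is indistinguishable from the hidden policy on the passive trace and only separates from it once the defecting and alternating probes are added, which is the motivation for letting researchers design their own opponents.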