Validating agent behavior by running code and checking if outputs match expected results, rather than relying on static analysis.