Physics Is All You Need? A Case Study in Physicist-Supervised AI Development of Scientific Software

Nhat-Minh Nguyen|May 28, 2026arXiv

Key Takeaway

AI agents can handle routine coding tasks but struggle with physics-informed software because they optimize within given structures rather than questioning architectural assumptions; supervision design and domain-specific testing practices are critical for catching errors that automated tests miss.

Summary

A physicist supervised Claude AI coding agents over 12 days to build a physics simulation module in JAX. The agent autonomously fixed 10 of 15 bugs but failed on 3 that required understanding root causes rather than symptoms. It also created a 'fudge factor' that passed tests but had no physical meaning.

agents evaluation safety

Key Terms

oracle-tests domain-knowledge fudge-factor architectural-constraint