Efficient and Sound Probabilistic Verification for AI Agents

Alaia Solko-Breslin, Pramod Kaushik Mudrakarta, Mihai Christodorescu, Somesh Jha, Krishnamurthy Dj Dvijotham | June 18, 2026 | arXiv

Key Takeaway

You can now formally verify AI agent security policies that rely on probabilistic components (such as imperfect detectors) and obtain guaranteed upper bounds on violation rates, even when you don't know how the components' errors correlate.

Summary

This paper presents a framework for verifying that AI agents follow security policies even when using unreliable components like PII detectors or classifiers that sometimes fail. Unlike existing approaches that assume perfect detection, this method computes guaranteed upper bounds on policy violations using robust optimization, without requiring assumptions about how errors correlate.
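To make "guaranteed upper bounds without correlation assumptions" concrete, here is a minimal sketch of the simplest such bound, the classical Fréchet (union) inequality. It is an illustration only, not the paper's robust-optimization method, and it assumes a hypothetical policy that is violated whenever at least one component errs. The function names and the error rates are made up for the example.

```python
def worst_case_any_failure(error_rates):
    """Upper bound on P(at least one component errs), valid for ANY
    dependence between the errors (Frechet / union bound)."""
    if not error_rates or any(p < 0.0 or p > 1.0 for p in error_rates):
        raise ValueError("error_rates must be non-empty probabilities in [0, 1]")
    return min(1.0, sum(error_rates))


def worst_case_all_fail(error_rates):
    """Upper bound on P(every component errs at once), again valid for
    any dependence: the joint event is no likelier than the rarest error."""
    if not error_rates or any(p < 0.0 or p > 1.0 for p in error_rates):
        raise ValueError("error_rates must be non-empty probabilities in [0, 1]")
    return min(error_rates)


def independent_any_failure(error_rates):
    """P(at least one component errs) IF errors were independent.
    Shown only for comparison; independence is an extra assumption."""
    no_error = 1.0
    for p in error_rates:
        no_error *= 1.0 - p
    return 1.0 - no_error


# Hypothetical error rates for two detectors (illustrative values only).
rates = [0.02, 0.03]
robust = worst_case_any_failure(rates)
independent = independent_any_failure(rates)
print(f"correlation-free bound: {robust:.4f}")
print(f"independence estimate:  {independent:.4f}")
```

The correlation-free bound is never smaller than the independence estimate, which is the price of making no assumption about how the errors interact. For policies with richer structure, a single closed-form inequality like this no longer suffices, which is where an optimization-based approach such as the one the summary describes would come in.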

safety, agents, evaluation

Key Terms

runtime-monitoring, distributionally-robust-optimization, formal-verification, policy-violation