You can combine LLMs with formal verification to automatically synthesize safety rules from human goals, catching errors before deployment and narrowing the gap between what we want AI systems to do and what they actually do.
This paper presents a system that automatically creates and verifies safety rules for AI systems by combining language models, formal logic, and causal reasoning. It takes high-level goals from humans (like "avoid collisions"), converts them into formal logical rules that can be checked for correctness, and tests them in autonomous driving scenarios.
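To make the pipeline concrete, here is a minimal sketch of the checking step: a collision-avoidance goal expressed as a formal invariant (a temporal-logic-style "always" property) and monitored over a simulated driving trace. The `State` fields, the `MIN_GAP` threshold, and the `always_min_gap` checker are illustrative assumptions, not the paper's actual rule representation or verification procedure.

```python
from dataclasses import dataclass

# Hypothetical formal rule: "always keep at least MIN_GAP meters to the
# nearest obstacle" -- the kind of invariant an LLM might synthesize from
# the goal "avoid collisions". The threshold is an assumed value.
MIN_GAP = 5.0


@dataclass
class State:
    t: float    # timestamp (seconds)
    gap: float  # distance to nearest obstacle (meters)


def always_min_gap(trace: list[State]) -> bool:
    """Check the invariant G(gap >= MIN_GAP) over a finite trace."""
    return all(s.gap >= MIN_GAP for s in trace)


# Simulated driving scenario: the ego vehicle steadily closes in on an
# obstacle, so the rule should eventually be violated.
trace = [State(t=i * 0.1, gap=20.0 - 2.0 * i) for i in range(10)]

if not always_min_gap(trace):
    violating = next(s for s in trace if s.gap < MIN_GAP)
    print(f"rule violated at t={violating.t:.1f}s (gap={violating.gap:.1f} m)")
else:
    print("trace satisfies the collision-avoidance rule")
```

In a full system, the rule itself would be LLM-generated and verified for correctness before being monitored; the sketch above shows only the final testing stage over a scenario trace.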