The Unfireable Safety Kernel: Execution-Time AI Alignment for AI Agents and Other Escapable AI Systems

Seth Dobrin, Łukasz Chmiel | June 24, 2026 | arXiv

Key Takeaway

AI safety controls embedded in an agent's own code can be bypassed; instead, safety enforcement should run in a separate process with formal verification, acting as an external referee that agents cannot manipulate.

Summary

This paper proposes the Unfireable Safety Kernel, a system that enforces AI safety constraints at execution time, outside the AI agent's own code, rather than relying on safeguards built into the agent itself.
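To make the idea of an external, fail-closed referee concrete, the following is a minimal sketch and not the paper's implementation. It assumes a hypothetical allowlist policy (`ALLOWED_ACTIONS`), a hypothetical JSON request format, and a hypothetical helper `ask_referee`; none of these names come from the paper. The referee runs in its own interpreter process, and the caller treats anything other than an explicit "allow" (a crash, a timeout, malformed output) as a denial. Formal verification of the referee is outside the scope of this sketch.

```python
import json
import subprocess
import sys

# Hypothetical policy: the only actions the referee will approve.
ALLOWED_ACTIONS = {"read_file", "list_dir"}

# Source of the referee. It runs in a separate interpreter process, so code
# in the agent's process cannot patch its functions or variables in memory.
REFEREE_SOURCE = """
import json, sys
allowed = set(json.loads(sys.argv[1]))
request = json.loads(sys.stdin.read())
verdict = "allow" if request.get("action") in allowed else "deny"
print(json.dumps({"verdict": verdict}))
"""


def ask_referee(request, timeout=5.0):
    """Return True only if the referee process explicitly answers "allow"."""
    try:
        proc = subprocess.run(
            [sys.executable, "-c", REFEREE_SOURCE,
             json.dumps(sorted(ALLOWED_ACTIONS))],
            input=json.dumps(request),
            capture_output=True,
            text=True,
            timeout=timeout,
        )
        if proc.returncode != 0:
            return False  # referee crashed: fail closed
        return json.loads(proc.stdout).get("verdict") == "allow"
    except (subprocess.TimeoutExpired, OSError, ValueError, TypeError):
        return False  # timeout, launch failure, or bad data: fail closed


if __name__ == "__main__":
    for action in ("read_file", "delete_all"):
        print(action, ask_referee({"action": action}))
```

In this sketch the gate function still lives in the caller's process for brevity. A design closer to the one the summary describes would presumably also place the component that actually executes actions behind the referee, so that an agent cannot simply skip the call.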

safety, agents, alignment

Key Terms

execution-time-compute, escapable-ai-systems, process-separation, fail-closed, formal-verification