Most prompt injection defenses are weaker than claimed—they fail to generalize across tasks and break down against adaptive attacks, highlighting the need for more robust security approaches.
PIArena is a unified platform for testing prompt injection attacks and defenses in AI systems. It reveals that current defenses have serious weaknesses: they don't work well across different tasks, fail against adaptive attacks, and struggle when injected instructions align with the model's original purpose.