Phantasia: Context-Adaptive Backdoors in Vision Language Models

Nam Duong Tran, Phi Le Nguyen|April 9, 2026arXiv

Key Takeaway

Backdoor attacks on multimodal AI models can be made significantly stealthier by generating context-aware poisoned outputs rather than fixed patterns—a critical finding for securing VLMs in production.

Summary

This paper reveals that existing backdoor attacks on Vision-Language Models are easier to detect than previously thought, and introduces Phantasia, a new attack that generates contextually appropriate malicious responses instead of fixed patterns, making it much harder to spot while maintaining normal performance.

safety multimodal

Key Terms

backdoor-attack vision-language-models poisoned-responses context-adaptive