Adversarial text placed inside images that misleads models into focusing on the lexical meaning of the text instead of the visual content of the image.