Vision Transformers can be made adversarially robust through standard adversarial training, and, surprisingly, overfitting does not necessarily hurt robustness when the data's signal-to-noise ratio is favorable, a finding that challenges conventional wisdom about the robustness-generalization tradeoff.
This paper provides the first theoretical analysis of adversarial training for Vision Transformers, showing that under certain conditions ViTs can retain strong robustness against adversarial attacks even when they overfit the training data.
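The training procedure analyzed is standard adversarial training, i.e., the min-max objective of minimizing loss on worst-case perturbed inputs. As a concrete reference point, below is a minimal PGD-based sketch of that procedure in PyTorch; the model choice, hyperparameters (`eps`, `alpha`, `steps`), and the synthetic batch are illustrative stand-ins, not the paper's actual experimental setup.

```python
import torch
import torch.nn.functional as F
from torchvision.models import vit_b_16

device = "cuda" if torch.cuda.is_available() else "cpu"
model = vit_b_16(num_classes=10).to(device)  # stand-in ViT; not the paper's model
opt = torch.optim.AdamW(model.parameters(), lr=1e-4)

# Illustrative l-inf attack budget (common CIFAR-style values, assumed here).
eps, alpha, steps = 8 / 255, 2 / 255, 7

def pgd_attack(model, x, y):
    """Inner maximization: PGD within an l-inf ball of radius eps around x."""
    delta = torch.empty_like(x).uniform_(-eps, eps)
    for _ in range(steps):
        delta.requires_grad_(True)
        loss = F.cross_entropy(model(x + delta), y)
        grad, = torch.autograd.grad(loss, delta)
        # Ascend the loss, then project back into the eps-ball and [0, 1] pixels.
        delta = (delta + alpha * grad.sign()).clamp(-eps, eps)
        delta = ((x + delta).clamp(0, 1) - x).detach()
    return (x + delta).detach()

# Toy random batch standing in for a real data loader.
x = torch.rand(4, 3, 224, 224, device=device)
y = torch.randint(0, 10, (4,), device=device)

model.train()
x_adv = pgd_attack(model, x, y)           # inner maximization
loss = F.cross_entropy(model(x_adv), y)   # outer minimization on adversarial inputs
opt.zero_grad()
loss.backward()
opt.step()
print(f"adversarial training loss: {loss.item():.4f}")
```

The paper's claim concerns what happens when this outer minimization is run long enough to overfit the (adversarial) training loss: under a favorable signal-to-noise ratio, the overfit model can nonetheless remain robust on test data.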