Quantum autoencoders can defend quantum classifiers against adversarial attacks by reconstructing corrupted inputs, achieving up to a 68% improvement in accuracy without needing adversarial training data.
This paper proposes a defense for quantum machine learning classifiers that uses a quantum autoencoder to clean corrupted input data before classification. Unlike traditional adversarial defenses, which must be trained on attack examples, this approach requires no adversarial training data and provides a confidence metric that flags suspicious inputs that cannot be properly cleaned.
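The pipeline described above (reconstruct the input, score how well it could be cleaned, flag low-confidence inputs, then classify) can be sketched with a classical stand-in. The paper's method uses a quantum autoencoder; the linear PCA-style "autoencoder", the fidelity-style confidence score, and the 0.5 threshold below are all illustrative assumptions, not the paper's actual construction:

```python
import numpy as np

rng = np.random.default_rng(0)

def fit_autoencoder(clean, k=2):
    """Hypothetical classical stand-in: 'train' a linear autoencoder by
    keeping the top-k principal directions of clean training data."""
    mean = clean.mean(axis=0)
    _, _, vt = np.linalg.svd(clean - mean, full_matrices=False)
    return mean, vt[:k]

def reconstruct(x, mean, basis):
    # "Cleaning" = projecting the input back onto the clean-data subspace.
    return mean + (x - mean) @ basis.T @ basis

def confidence(x, x_rec):
    # Proxy for the paper's confidence metric: near 1 when the input is
    # reconstructed faithfully, low when it cannot be properly cleaned.
    return 1.0 / (1.0 + np.linalg.norm(x - x_rec))

# Toy data: clean inputs lie near a 2-D plane inside a 10-D space.
plane = rng.standard_normal((2, 10))
clean_data = rng.standard_normal((200, 2)) @ plane
mean, basis = fit_autoencoder(clean_data, k=2)

x_clean = rng.standard_normal(2) @ plane
x_attacked = x_clean + 3.0 * rng.standard_normal(10)  # off-manifold perturbation

for name, x in [("clean", x_clean), ("attacked", x_attacked)]:
    x_rec = reconstruct(x, mean, basis)
    c = confidence(x, x_rec)
    flagged = c < 0.5  # illustrative threshold for "suspicious" inputs
    print(f"{name}: confidence={c:.3f} flagged={flagged}")
```

The clean input reconstructs almost perfectly and passes through to the classifier, while the adversarially perturbed input leaves the clean-data subspace, reconstructs poorly, and gets flagged. The key design point mirrored here is that the defense is fit only on clean data, so no adversarial examples are ever needed during training.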