Options
2024
Paper (Preprint, Research Paper, Review Paper, White Paper, etc.)
Title
Adversarial Examples are Misaligned in Diffusion Model Manifolds
Title Supplement
Published on arXiv
Abstract
In recent years, diffusion models (DMs) have drawn significant attention for their success in approximating data distributions, yielding state-of-the-art generative results. Nevertheless, the versatility of these models extends beyond their generative capabilities to encompass various vision applications, such as image inpainting, segmentation, and adversarial robustness, among others. This study is dedicated to the investigation of adversarial attacks through the lens of diffusion models. However, our objective does not involve enhancing the adversarial robustness of image classifiers. Instead, our focus lies in utilizing the diffusion model to detect and analyze the anomalies introduced by these attacks on images. To that end, we systematically examine the alignment of the distributions of adversarial examples when subjected to the process of transformation using diffusion models. The efficacy of this approach is assessed across the CIFAR-10 and ImageNet datasets, including varying image sizes in the latter. The results demonstrate a notable capacity to discriminate effectively between benign and attacked images, providing compelling evidence that adversarial instances do not align with the learned manifold of the DMs.
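To illustrate the idea described in the abstract, the following is a minimal sketch (not the authors' code) of one way to measure how well an image "round-trips" through a pretrained diffusion model: the image is partially diffused forward and then denoised back, and the reconstruction error serves as an alignment score. The checkpoint name, the `roundtrip_error` helper, and the `t_start` parameter are illustrative assumptions; the abstract does not specify the exact transformation procedure.

```python
# Hypothetical sketch: score an image by its reconstruction error after a
# forward-diffuse / reverse-denoise pass through a pretrained DM. Under the
# assumption suggested by the abstract, adversarial examples lie off the
# model's learned manifold and should reconstruct worse than benign images.
import torch
from diffusers import DDPMPipeline

device = "cuda" if torch.cuda.is_available() else "cpu"
pipe = DDPMPipeline.from_pretrained("google/ddpm-cifar10-32").to(device)
unet, scheduler = pipe.unet, pipe.scheduler
scheduler.set_timesteps(1000)

@torch.no_grad()
def roundtrip_error(image: torch.Tensor, t_start: int = 200) -> float:
    """image: (1, 3, 32, 32) tensor scaled to [-1, 1]."""
    image = image.to(device)
    noise = torch.randn_like(image)
    t = torch.tensor([t_start], device=device)
    # Forward process: diffuse the clean image up to timestep t_start.
    sample = scheduler.add_noise(image, noise, t)
    # Reverse process: denoise step by step back to t = 0.
    for step in scheduler.timesteps[scheduler.timesteps <= t_start]:
        eps = unet(sample, step).sample
        sample = scheduler.step(eps, step, sample).prev_sample
    # Lower error = closer to the DM manifold (benign, under this assumption);
    # higher error suggests an off-manifold (attacked) input.
    return torch.mean((sample - image) ** 2).item()
```

A detector built on this score would simply threshold `roundtrip_error` (e.g., calibrated on a held-out set of benign images); this is only one plausible instantiation of the distribution-alignment check the abstract refers to.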