2024
Conference Paper
Title

Signals Are All You Need: Detecting and Mitigating Digital and Real-World Adversarial Patches Using Signal-Based Features

Abstract
In recent years, neural networks have found their way into various applications and processes, including image classification, object detection for self-driving cars, and face recognition systems used for biometric verification and surveillance. However, even the most advanced object detectors remain susceptible to adversarial patch attacks: small distortions that can be digitally inserted into images or physically placed in the real world. These attacks can cause detectors to miss actual objects, detect non-existent ones, or predict incorrect object classes. Given the high confidence of the resulting predictions, these adversarial attacks pose a significant threat to the trustworthiness of AI-enabled systems. In this paper, we propose a novel detection approach for digital and real-world adversarial patches based on the analysis of handcrafted features derived from signal processing. We developed two versions of the algorithm: one uses Error Level Analysis (ELA), while the other leverages Haralick's texture features computed from the grey-level co-occurrence matrix (GLCM). By applying a Chan-Vese-based segmentation, regions potentially containing the adversarial patches can be identified. Image inpainting techniques based on signal processing and diffusion models can then be used to remove the patches so that the model produces the correct prediction output. We evaluated our approaches on various types of adversarial patches, i.e., real-world, textured digital, and smooth digital adversarial patches, as well as on classifiers addressing a variety of tasks. For images featuring digital adversarial patches, we based our experiments on subsets of the ImageNet and ImageNet-Patch datasets, as well as on a subset of LFW with meaningful adversarial stickers. Using ELA, we achieved accuracies of 93% and 86%, respectively, while the texture analysis method yielded accuracies of 86% and 67%. For real-world scenarios, we expanded our analysis to the APRICOT and MS COCO datasets. Here, the ELA-based approach achieved an accuracy of 80%, and the GLCM-based approach reached a slightly higher 81%. This indicates that both methods are practically applicable, with the GLCM-based approach showing a slight edge on real-world data.
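
To make the pipeline outlined above more concrete, the following is a minimal, illustrative sketch of the two signal-based feature extractors and the Chan-Vese segmentation step; it is not the authors' implementation. It assumes Pillow, NumPy, and scikit-image, and all parameter values (JPEG quality, GLCM distances and angles, Chan-Vese settings) as well as the file name are placeholders:

# Illustrative sketch only (not the paper's code). Assumes Pillow, NumPy and
# scikit-image; every parameter value below is a placeholder.
import io

import numpy as np
from PIL import Image
from skimage.feature import graycomatrix, graycoprops
from skimage.segmentation import chan_vese


def error_level_analysis(image: Image.Image, quality: int = 90) -> np.ndarray:
    """Error Level Analysis: re-save the image as JPEG and return the
    normalised per-pixel difference map. Spliced-in regions such as digital
    adversarial patches often re-compress differently and stand out here."""
    buf = io.BytesIO()
    image.convert("RGB").save(buf, format="JPEG", quality=quality)
    buf.seek(0)
    resaved = np.asarray(Image.open(buf), dtype=np.int16)
    original = np.asarray(image.convert("RGB"), dtype=np.int16)
    diff = np.abs(original - resaved).astype(np.float32)
    return diff / max(float(diff.max()), 1.0)  # scale to [0, 1]


def haralick_features(gray_patch: np.ndarray) -> np.ndarray:
    """Haralick-style texture features from a grey-level co-occurrence
    matrix (GLCM); expects a 2-D uint8 grey-level image region."""
    glcm = graycomatrix(gray_patch, distances=[1], angles=[0, np.pi / 2],
                        levels=256, symmetric=True, normed=True)
    props = ("contrast", "homogeneity", "energy", "correlation")
    return np.hstack([graycoprops(glcm, p).ravel() for p in props])


if __name__ == "__main__":
    img = Image.open("example.jpg")  # placeholder input image
    ela_map = error_level_analysis(img)

    # Chan-Vese segmentation of the grey-scale ELA map yields a binary mask
    # of candidate patch regions, which could then be handed to an
    # inpainting method to restore the image.
    mask = chan_vese(ela_map.mean(axis=2), mu=0.25, max_num_iter=200)
    print("candidate patch pixels:", int(mask.sum()))

Which feature extractor works better depends on the data; the abstract reports a slight edge for the GLCM-based variant on the real-world datasets.
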
Author(s)
Bunzel, Niklas  
Fraunhofer-Institut für Sichere Informationstechnologie SIT  
Frick, Raphael Antonius
Fraunhofer-Institut für Sichere Informationstechnologie SIT  
Klause, Gerrit
Fraunhofer-Institut für Sichere Informationstechnologie SIT  
Schwarte, Aino
Honermann, Jonas
Mainwork
Proceedings of the 2nd ACM Workshop on Secure and Trustworthy Deep Learning Systems. Part of: ASIA CCS '24
Conference
Workshop on Secure and Trustworthy Deep Learning Systems 2024  
DOI
10.1145/3665451.3665530
Language
English
Institute
Fraunhofer-Institut für Sichere Informationstechnologie SIT
Keyword(s)
  • Adversarial Machine Learning
  • Adversarial Patches
  • Detector