2021
Conference Paper
Title
DA3G: Detecting Adversarial Attacks by Analysing Gradients
Abstract
Deep learning models are vulnerable to specifically crafted inputs, called adversarial examples. In this paper, we present DA3G, a novel method to reliably detect evasion attacks on neural networks. We analyse the behaviour of the network under test on the given input sample. Compared to the benign training data, adversarial examples cause a discrepancy between visual and causal perception: although the input remains visually close to a benign class, the network's output is shifted at the attacker's will. DA3G detects these changes in the pattern of the gradient using an auxiliary neural network. Our end-to-end approach readily integrates with a variety of existing architectures. DA3G reliably detects known as well as unknown attacks and increases the difficulty of adaptive attacks.
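The abstract describes feeding the gradient pattern of the network under test into an auxiliary detector network. The sketch below is an illustrative approximation of that idea, not the authors' implementation: the loss used for the gradient, the detector architecture, and all names and hyperparameters are assumptions chosen for the example.

```python
# Illustrative sketch (assumed, not the authors' code): gradient-based
# adversarial-example detection in PyTorch.
import torch
import torch.nn as nn


def input_gradient(classifier: nn.Module, x: torch.Tensor) -> torch.Tensor:
    """Gradient of the classifier's loss w.r.t. the input, taken at the
    classifier's own prediction (no ground-truth label needed at test time)."""
    x = x.clone().detach().requires_grad_(True)
    logits = classifier(x)
    pred = logits.argmax(dim=1)
    loss = nn.functional.cross_entropy(logits, pred)
    grad, = torch.autograd.grad(loss, x)
    return grad


class GradientDetector(nn.Module):
    """Auxiliary network that classifies the gradient pattern of an input
    as benign (class 0) or adversarial (class 1)."""

    def __init__(self, grad_dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Flatten(),
            nn.Linear(grad_dim, 256),
            nn.ReLU(),
            nn.Linear(256, 2),
        )

    def forward(self, grad: torch.Tensor) -> torch.Tensor:
        return self.net(grad)


# Hypothetical usage: flag an input as adversarial if the detector
# predicts class 1 for its gradient pattern.
# classifier = ...                                   # network under test
# detector = GradientDetector(grad_dim=3 * 32 * 32)  # e.g. CIFAR-10-sized inputs
# grad = input_gradient(classifier, x_batch)
# is_adversarial = detector(grad).argmax(dim=1).bool()
```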