2021
Conference Paper
Title
DA3G: Detecting Adversarial Attacks by Analysing Gradients
Abstract
Deep learning models are vulnerable to specifically crafted inputs, called adversarial examples. In this paper, we present DA3G, a novel method to reliably detect evasion attacks on neural networks. We analyse the behaviour of the network under test on the given input sample. Compared to the benign training data, adversarial examples cause a discrepancy between visual and causal perception: although the input remains visually close to a benign class, the network's output is shifted at the attacker's will. DA3G detects these changes in the pattern of the gradient using an auxiliary neural network. Our end-to-end approach readily integrates with a variety of existing architectures. DA3G reliably detects known as well as unknown attacks and increases the difficulty of adaptive attacks.
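The abstract describes feeding the gradient pattern of the network under test into an auxiliary detector network. The sketch below is an illustrative approximation of that idea, not the authors' implementation: the loss used for the gradient, the detector architecture, and all names and hyperparameters are assumptions chosen for the example.

```python
# Illustrative sketch (assumed, not the authors' code): gradient-based
# adversarial-example detection in PyTorch.
import torch
import torch.nn as nn


def input_gradient(classifier: nn.Module, x: torch.Tensor) -> torch.Tensor:
    """Gradient of the classifier's loss w.r.t. the input, taken at the
    classifier's own prediction (no ground-truth label needed at test time)."""
    x = x.clone().detach().requires_grad_(True)
    logits = classifier(x)
    pred = logits.argmax(dim=1)
    loss = nn.functional.cross_entropy(logits, pred)
    grad, = torch.autograd.grad(loss, x)
    return grad


class GradientDetector(nn.Module):
    """Auxiliary network that classifies the gradient pattern of an input
    as benign (class 0) or adversarial (class 1)."""

    def __init__(self, grad_dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Flatten(),
            nn.Linear(grad_dim, 256),
            nn.ReLU(),
            nn.Linear(256, 2),
        )

    def forward(self, grad: torch.Tensor) -> torch.Tensor:
        return self.net(grad)


# Hypothetical usage: flag an input as adversarial if the detector
# predicts class 1 for its gradient pattern.
# classifier = ...                                   # network under test
# detector = GradientDetector(grad_dim=3 * 32 * 32)  # e.g. CIFAR-10-sized inputs
# grad = input_gradient(classifier, x_batch)
# is_adversarial = detector(grad).argmax(dim=1).bool()
```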