Transformer-based Extraction of Deep Image Models

Battis, Verena; Penner, Alexander

doi:10.1109/EuroSP53844.2022.00028

2022

Conference Paper

Abstract

Model extraction attacks pose a threat to the security of ML models and to the privacy of the data used for training. Previous research has shown that such attacks can be either monetarily motivated, to gain an edge over competitors, or malicious, to mount subsequent attacks on the extracted model. In this paper, recent advances in the field of transformers are exploited to propose an attack tailored to the task of image classification that allows stealing complex convolutional neural network models without any knowledge of their architecture. The attack was performed on a range of datasets and target architectures to evaluate its robustness. With only 100k queries, we were able to recover up to 99.2% of the black-box target network's accuracy on the test set. We conclude that it is possible to effectively steal complex neural networks with relatively little expertise and conventional means, even without knowledge of the target's architecture. Recently proposed defences have also been examined for their effectiveness in preventing the proposed attack.

Author(s)

Battis, Verena

Fraunhofer-Institut für Sichere Informationstechnologie SIT

Penner, Alexander

Fraunhofer-Institut für Sichere Informationstechnologie SIT

Mainwork

7th IEEE European Symposium on Security and Privacy, EuroS&P 2022

Conference

European Symposium on Security and Privacy 2022
