2021
Conference Paper
Title
Towards Privacy-Preserving Classification-as-a-Service for DGA Detection
Abstract
Domain generation algorithm (DGA) classifiers can be used to detect and block the establishment of a connection between bots and their command-and-control server. Classification-as-a-service (CaaS) can separate the classification of domain names from the need for real-world training data, which are difficult to obtain but mandatory for well-performing classifiers. However, domain names as well as trained models may contain privacy-critical information which should not be leaked to either the model provider or the data provider. Several generic frameworks for privacy-preserving machine learning (ML) have been proposed in the past that can preserve data and model privacy. Thus, it seems high time to combine state-of-the-art DGA classifiers and privacy-preservation frameworks to enable privacy-preserving CaaS, preserving both data and model privacy for the DGA detection use case. In this work, we examine the real-world applicability of four generic frameworks for privacy-preserving ML using different state-of-the-art DGA detection models. Our results show that out-of-the-box DGA detection models are computationally infeasible for privacy-preserving inference in a real-world setting. We propose model simplifications that reduce inference latency by up to 95% and communication complexity by up to 97%, while causing an accuracy penalty of less than 0.17%. Despite this significant improvement, real-time classification is still not feasible in a traditional two-party setting. Thus, more efficient secure multi-party computation (SMPC) or homomorphic encryption (HE) schemes are required to make privacy-preserving CaaS for DGA detection feasible in the real world.
Author(s)