July 2024
Conference Paper
Title
Towards Trustworthy Dataset Distillation: A Benchmark of Privacy, Fairness and Robustness
Abstract
Dataset distillation is an increasingly prevalent technique for condensing a large-scale dataset into a more compact version while preserving its intrinsic utility. However, very few studies have investigated the trustworthiness of dataset distillation, i.e., its privacy, fairness, and robustness. This deficiency is particularly striking given existing research underscoring the vulnerabilities of current AI models, including privacy breaches, biased predictions against underrepresented subgroups, and susceptibility to imperceptible attacks. To bridge this gap, we propose a trustworthy benchmark for assessing representative dataset distillation methods on the CIFAR-10 benchmark with comprehensive evaluation metrics. Through extensive experiments, we uncover vulnerabilities inherent in the application of dataset distillation, offering valuable insights for practitioners. Our work aims to drive the development of more transparent, reliable, and responsible machine learning models, fostering AI systems that align with trustworthy principles.