2023
Master Thesis
Title
Identity Synthesis via Latent Diffusion Models for Accurate Face Recognition
Other Title
Identitätssynthese mithilfe von Latent Diffusion Modellen für Akkurate Gesichtserkennung
Abstract
The availability of large-scale identity-labelled databases has been crucial to the significant advances in face recognition (FR) research over the past decade. However, recent legal and ethical concerns have led to the retraction of many of these databases, raising the question of how FR research can continue without one of its key resources. Synthetic data generated by deep generative models has emerged as a promising alternative, but current methods based on generative adversarial networks (GANs) suffer from either low intra-class variation or low separability of the generated identities, which limits the performance of FR models trained on their synthetic datasets. This thesis proposes IDiff-Face, a new approach that applies conditional diffusion models to identity-based synthetic data generation for FR. Within this framework, identity representations extracted with a pre-trained FR model serve as conditioning vectors for the generative process. During sampling, synthetically created identity representations can be used to generate images of synthetic identities, or face embeddings of real images can be used to generate diverse variations. Unlike the GAN-based approaches, IDiff-Face offers a more realistic trade-off between identity-separability and intra-class variation, which is controllable at training time through a contextual partial dropout (CPD) mechanism. A realistic balance between these two properties is necessary to reach verification accuracies close to those achieved by FR models trained on real data.
In a fair comparison with related work, the synthetic dataset (500K images) generated by IDiff-Face sets a new state of the art with an average verification accuracy of 88.20% (previous state of the art: 83.45%) over five benchmarks: 98.00% on LFW, 86.43% on AgeDB-30, 90.65% on CA-LFW, 85.47% on CFP-FP, and 80.45% on CP-LFW. It thereby narrows the gap to state-of-the-art FR models trained on real data, whose average accuracies with the CASIA-WebFace (500K images) and MS1MV2 (5.8M images) training datasets are 94.92% and 97.18%, respectively. Moreover, it exceeds the reported human-level face verification accuracy of 97.5% on LFW. In summary, IDiff-Face provides a simple yet effective solution for generating synthetic data for FR that can mitigate the privacy concerns associated with real databases. The proposed approach achieves new state-of-the-art results and provides valuable insights for future research in this field.
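The reported average can be reproduced from the per-benchmark figures quoted in the abstract. The following is a minimal sketch for checking that arithmetic only; the variable and function names are illustrative assumptions and are not taken from the thesis.

```python
# Per-benchmark verification accuracies (%) as quoted in the abstract.
# The dictionary and function names are hypothetical, chosen for illustration.
idiff_face_accuracy = {
    "LFW": 98.00,
    "AgeDB-30": 86.43,
    "CA-LFW": 90.65,
    "CFP-FP": 85.47,
    "CP-LFW": 80.45,
}

# Average accuracies (%) of FR models trained on real data, as quoted.
real_data_average = {
    "CASIA-WebFace": 94.92,
    "MS1MV2": 97.18,
}


def average_accuracy(scores):
    """Return the unweighted mean of the benchmark accuracies."""
    return sum(scores.values()) / len(scores)


synthetic_average = average_accuracy(idiff_face_accuracy)
print(f"IDiff-Face average: {synthetic_average:.2f}%")

# Remaining gap (in percentage points) to models trained on real data.
for dataset, real_average in real_data_average.items():
    gap = real_average - synthetic_average
    print(f"Gap to {dataset}: {gap:.2f} percentage points")
```

The unweighted mean over the five listed benchmarks matches the 88.20% stated in the abstract, which suggests the average is taken over exactly these five.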
Thesis Note
Darmstadt, TU, Master Thesis, 2023