2022
Master Thesis
Title
On the Effect of Synthetic Data Generation on Face Recognition Performance
Other Title
Der Einfluss synthetischer Datengenerierung auf die Leistung von Gesichtserkennung
Abstract
Due to the use of large-scale databases as training sets for deep convolutional neural networks, the performance of face recognition systems has improved rapidly over the last few years. However, there are concerns regarding the privacy rights of the people whose images are part of these databases, and the European Union’s General Data Protection Regulation has emphasized these concerns. This raises the question of whether synthetic data can be used for face recognition training. Among other factors, it remains uncertain whether the identities depicted in synthetic images are truly separate from the real identities in the training databases of the image generation models. This thesis proposes generating labeled facial images with a focus on class separability by extending the GAN training process with a classification model that helps the generator produce images of separate synthetic class labels. The classification model’s feature extractor is pretrained on real data, and domain adaptation is introduced as an additional learning objective for the generator. An extensive evaluation is conducted concerning class separability and identity preservation between real facial image data and synthetic datasets. It is found that, to a degree, synthetic images from state-of-the-art face generation architectures do not preserve the identities portrayed in the real images. The proposed architecture improves the class separability of generated datasets: the Fisher Discriminant Ratio, a criterion for class separability, increases from 1.109 for a synthetic dataset from the face generation architecture StyleGAN2-ADA to 3.763 for a synthetic dataset from the proposed approach. Furthermore, this work investigates the influence of class-separable synthetic training datasets on face recognition performance. The analysis shows that using training datasets with increased class separability, while attending to the trade-off with intra-class variation, can lead to better-performing face recognition models.
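As a rough illustration of the separability criterion mentioned in the abstract, the sketch below computes one common multi-class form of the Fisher Discriminant Ratio: the between-class scatter divided by the within-class scatter (a trace ratio). The abstract does not state which exact formulation the thesis uses, so this definition, the function name `fisher_discriminant_ratio`, and the toy feature vectors are assumptions made for illustration only.

```python
import numpy as np


def fisher_discriminant_ratio(features, labels):
    """Trace-ratio Fisher Discriminant Ratio (assumed definition).

    features: array of shape (n_samples, n_dims), e.g. face embeddings.
    labels:   array of shape (n_samples,), one identity label per sample.
    Returns between-class scatter divided by within-class scatter.
    """
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    overall_mean = features.mean(axis=0)

    between = 0.0  # spread of class means around the overall mean
    within = 0.0   # spread of samples around their own class mean
    for c in np.unique(labels):
        class_feats = features[labels == c]
        class_mean = class_feats.mean(axis=0)
        between += len(class_feats) * np.sum((class_mean - overall_mean) ** 2)
        within += np.sum((class_feats - class_mean) ** 2)
    return between / within


# Hypothetical toy data: two identities with two 2-D feature vectors each.
labels = [0, 0, 1, 1]
far_apart = [[0, 0], [0, 2], [10, 0], [10, 2]]
close_together = [[0, 0], [0, 2], [2, 0], [2, 2]]

print(fisher_discriminant_ratio(far_apart, labels))       # 25.0
print(fisher_discriminant_ratio(close_together, labels))  # 1.0
```

Under this assumed definition, a higher value means the class means lie further apart relative to the spread inside each class, which is the direction of improvement the abstract reports (1.109 to 3.763).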
Thesis Note
Darmstadt, TU, Master Thesis, 2022