2025
Conference Paper not in Proceedings
Title

ViT-FIQA: Assessing Face Image Quality using Vision Transformers

Abstract
Face Image Quality Assessment (FIQA) aims to predict the utility of a face image for face recognition (FR) systems. State-of-the-art FIQA methods mainly rely on convolutional neural networks (CNNs), leaving the potential of Vision Transformer (ViT) architectures underexplored. This work proposes ViT-FIQA, a novel approach that extends standard ViT backbones, originally optimized for FR, through a learnable quality token designed to predict a scalar utility score for any given face image. The learnable quality token is concatenated with the standard image patch tokens, and the whole sequence is processed via global self-attention by the ViT encoders to aggregate contextual information across all patches. At the output of the backbone, ViT-FIQA branches into two heads: (1) the patch tokens are passed through a fully connected layer to learn discriminative face representations via a margin-penalty softmax loss, and (2) the quality token is fed into a regression head to learn to predict the face sample's utility. Extensive experiments on challenging benchmarks and several FR models, including both CNN- and ViT-based architectures, demonstrate that ViT-FIQA consistently achieves top-tier performance. These results underscore the effectiveness of transformer-based architectures in modeling face image utility and highlight the potential of ViTs as a scalable foundation for future FIQA research. https://cutt.ly/irHlzXUC
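The token flow described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the sizes (IMG, PATCH, DIM, EMB), the single simplified attention layer, the placement of the quality token at index 0, the omission of positional embeddings, and the random weights are all assumptions made for illustration. Only the forward pass is shown; the margin-penalty softmax loss and the quality regression target used in training are left out.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only to keep the sketch small.
IMG, PATCH, DIM, EMB = 32, 8, 16, 24
N_PATCHES = (IMG // PATCH) ** 2
PATCH_LEN = PATCH * PATCH * 3


def patchify(image):
    """Split an (IMG, IMG, 3) image into flattened non-overlapping patches."""
    g = IMG // PATCH
    patches = image.reshape(g, PATCH, g, PATCH, 3).transpose(0, 2, 1, 3, 4)
    return patches.reshape(N_PATCHES, PATCH_LEN)


def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)


def self_attention(tokens, w_q, w_k, w_v):
    """Single-head global self-attention with a residual connection."""
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v
    attn = softmax(q @ k.T / np.sqrt(DIM))  # every token attends to every token
    return tokens + attn @ v


# Random parameters stand in for trained weights (assumption).
params = {
    "embed": rng.normal(0, 0.02, (PATCH_LEN, DIM)),
    "quality_token": rng.normal(0, 0.02, (1, DIM)),  # the learnable quality token
    "w_q": rng.normal(0, 0.02, (DIM, DIM)),
    "w_k": rng.normal(0, 0.02, (DIM, DIM)),
    "w_v": rng.normal(0, 0.02, (DIM, DIM)),
    "fr_head": rng.normal(0, 0.02, (N_PATCHES * DIM, EMB)),  # fully connected layer
    "q_head": rng.normal(0, 0.02, (DIM,)),                   # regression head
}


def forward(image, p):
    """Return (face embedding, scalar quality score) for one image."""
    patch_tokens = patchify(image) @ p["embed"]              # (N_PATCHES, DIM)
    # Concatenate the quality token with the patch tokens.
    tokens = np.concatenate([p["quality_token"], patch_tokens], axis=0)
    tokens = self_attention(tokens, p["w_q"], p["w_k"], p["w_v"])
    quality_out, patch_out = tokens[0], tokens[1:]
    embedding = patch_out.reshape(-1) @ p["fr_head"]         # head (1): FR embedding
    quality = float(quality_out @ p["q_head"])               # head (2): utility score
    return embedding, quality


image = rng.random((IMG, IMG, 3))
embedding, quality = forward(image, params)
print(embedding.shape, quality)
```

Because the quality token passes through the same self-attention as the patch tokens, its output can draw on information from every patch before the regression head reduces it to a single score.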
Author(s)
Atzori, Andrea
Fraunhofer-Institut für Graphische Datenverarbeitung IGD  
Boutros, Fadi
Fraunhofer-Institut für Graphische Datenverarbeitung IGD  
Damer, Naser  
Fraunhofer-Institut für Graphische Datenverarbeitung IGD  
Project(s)
Next Generation Biometric Systems  
Next Generation Biometric Systems  
Funder
Bundesministerium für Bildung und Forschung  
Hessen, Ministerium für Wissenschaft und Kunst  
Conference
International Conference on Computer Vision 2025  
Open Access
Rights
Use according to copyright law
DOI
10.24406/publica-6858
Language
English
Fraunhofer-Institut für Graphische Datenverarbeitung IGD  
Keyword(s)
  • Sector: Infrastructure and Public Services

  • Research Line: Computer vision (CV)

  • Research Line: Human computer interaction (HCI)

  • Research Line: Machine learning (ML)

  • LTA: Machine intelligence, algorithms, and data structures (incl. semantics)

  • Biometrics

  • Face Recognition

  • Machine learning

  • Artificial intelligence (AI)

  • ATHENE
