Fraunhofer-Gesellschaft
2025
Conference Paper not in Proceedings
Title

ViT-FIQA: Assessing Face Image Quality using Vision Transformers

Abstract
Face Image Quality Assessment (FIQA) aims to predict the utility of a face image for face recognition (FR) systems. State-of-the-art FIQA methods mainly rely on convolutional neural networks (CNNs), leaving the potential of Vision Transformer (ViT) architectures underexplored. This work proposes ViT-FIQA, a novel approach that extends standard ViT backbones, originally optimized for FR, through a learnable quality token designed to predict a scalar utility score for any given face image. The learnable quality token is concatenated with the standard image patch tokens, and the whole sequence is processed via global self-attention by the ViT encoders to aggregate contextual information across all patches. At the output of the backbone, ViT-FIQA branches into two heads: (1) the patch tokens are passed through a fully connected layer to learn discriminative face representations via a margin-penalty softmax loss, and (2) the quality token is fed into a regression head to learn to predict the face sample's utility. Extensive experiments on challenging benchmarks and several FR models, including both CNN- and ViT-based architectures, demonstrate that ViT-FIQA consistently achieves top-tier performance. These results underscore the effectiveness of transformer-based architectures in modeling face image utility and highlight the potential of ViTs as a scalable foundation for future FIQA research. https://cutt.ly/irHlzXUC
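The token flow described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the single attention layer stands in for the full ViT encoder stack, the quality token is prepended (the abstract only says "concatenated"), all weights are random, and the names `init_params`, `self_attention`, and `vit_fiqa_forward` are hypothetical. The margin-penalty softmax loss and the training of the regression head are omitted; only the forward pass and the two-head split are shown.

```python
import numpy as np


def init_params(num_patches, dim, embed_dim, seed=0):
    """Random stand-in weights; a trained model would learn all of these."""
    rng = np.random.default_rng(seed)
    scale = 1.0 / np.sqrt(dim)
    return {
        "quality_token": rng.normal(size=(1, dim)) * scale,  # learnable token
        "w_q": rng.normal(size=(dim, dim)) * scale,
        "w_k": rng.normal(size=(dim, dim)) * scale,
        "w_v": rng.normal(size=(dim, dim)) * scale,
        "w_embed": rng.normal(size=(num_patches * dim, embed_dim)) * scale,
        "w_quality": rng.normal(size=dim) * scale,
        "b_quality": 0.0,
    }


def self_attention(tokens, w_q, w_k, w_v):
    """Single-head global self-attention: every token attends to every token."""
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v


def vit_fiqa_forward(patch_tokens, params):
    """Return (face embedding, scalar quality score) for one image."""
    # Concatenate the quality token with the patch tokens: shape (1 + N, D).
    # Assumption: the token goes first; the source does not state the position.
    seq = np.concatenate([params["quality_token"], patch_tokens], axis=0)
    # One attention layer with a residual connection, standing in for the
    # ViT encoder stack.
    seq = seq + self_attention(seq, params["w_q"], params["w_k"], params["w_v"])
    quality_out, patch_out = seq[0], seq[1:]
    # Head 1: flattened patch tokens -> fully connected layer -> embedding.
    embedding = patch_out.reshape(-1) @ params["w_embed"]
    # Head 2: quality token -> linear regression head -> scalar utility.
    quality = float(quality_out @ params["w_quality"] + params["b_quality"])
    return embedding, quality


params = init_params(num_patches=16, dim=8, embed_dim=4)
patches = np.random.default_rng(1).normal(size=(16, 8))
embedding, quality = vit_fiqa_forward(patches, params)
print(embedding.shape, quality)
```

Because the quality token takes part in the same self-attention as the patch tokens, its output depends on every patch, which is what lets a single token summarize the utility of the whole image.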
Author(s)
Atzori, Andrea
Fraunhofer-Institut für Graphische Datenverarbeitung IGD  
Boutros, Fadi
Fraunhofer-Institut für Graphische Datenverarbeitung IGD  
Damer, Naser  
Fraunhofer-Institut für Graphische Datenverarbeitung IGD  
Project(s)
Next Generation Biometric Systems  
Funder
Bundesministerium für Bildung und Forschung -BMBF-  
Hessisches Ministerium für Wissenschaft und Kunst -HMWK-  
Conference
International Conference on Computer Vision 2025  
Open Access
Rights
Use according to copyright law
DOI
10.24406/publica-6858
Language
English
Fraunhofer-Institut für Graphische Datenverarbeitung IGD  
Keyword(s)
  • Industry: Infrastructure and Public Services
  • Research Line: Computer vision (CV)
  • Research Line: Human computer interaction (HCI)
  • Research Line: Machine learning (ML)
  • LTA: Machine intelligence, algorithms, and data structures (incl. semantics)
  • Biometrics
  • Face Recognition
  • Machine learning
  • Artificial intelligence (AI)
  • ATHENE