August 27, 2025
Paper (Preprint, Research Paper, Review Paper, White Paper, etc.)
Title

Linking heterogeneous microstructure informatics with expert characterization knowledge through customized and hybrid vision-language representations for industrial qualification

Title Supplement
Published on arXiv
Abstract
Rapid and reliable qualification of advanced materials remains a bottleneck in industrial manufacturing, particularly for heterogeneous structures produced via non-conventional additive manufacturing processes. This study introduces a novel framework that links microstructure informatics with a range of expert characterization knowledge using customized and hybrid vision-language representations (VLRs). By integrating deep semantic segmentation with pre-trained multi-modal models (CLIP and FLAVA), we encode both visual microstructural data and textual expert assessments into shared representations. To overcome limitations in general-purpose embeddings, we develop a customized similarity-based representation that incorporates both positive and negative references from expert-annotated images and their associated textual descriptions. This allows zero-shot classification of previously unseen microstructures through a net similarity scoring approach. Validation on an additively manufactured metal matrix composite dataset demonstrates the framework’s ability to distinguish between acceptable and defective samples across a range of characterization criteria. Comparative analysis reveals that the FLAVA model offers higher visual sensitivity, while the CLIP model provides consistent alignment with the textual criteria. Z-score normalization adjusts raw unimodal and cross-modal similarity scores based on their local dataset-driven distributions, enabling more effective alignment and classification in the hybrid vision-language framework. The proposed method enhances traceability and interpretability in qualification pipelines by enabling human-in-the-loop decision-making without task-specific model retraining. By advancing semantic interoperability between raw data and expert knowledge, this work contributes toward scalable and domain-adaptable qualification strategies in engineering informatics.
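The net similarity scoring and z-score normalization mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the scoring formula, so the aggregation used here (mean cosine similarity to positive references minus mean cosine similarity to negative references) is an assumption, as are all function names, the zero decision threshold, and the toy three-dimensional vectors, which merely stand in for CLIP or FLAVA embeddings.

```python
import math
from statistics import mean, pstdev


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def net_similarity(query, positives, negatives):
    """Assumed form: mean similarity to positive references
    minus mean similarity to negative references."""
    pos = mean(cosine(query, p) for p in positives)
    neg = mean(cosine(query, n) for n in negatives)
    return pos - neg


def z_scores(values):
    """Standardize scores against their own (local) distribution."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Toy stand-ins for embeddings of expert-annotated references (hypothetical).
positives = [[1.0, 0.1, 0.0], [0.9, 0.2, 0.1]]   # "acceptable" references
negatives = [[0.0, 1.0, 0.2], [0.1, 0.9, 0.3]]   # "defective" references

# Toy stand-ins for embeddings of previously unseen microstructures.
queries = {
    "sample_a": [0.95, 0.15, 0.05],
    "sample_b": [0.05, 0.95, 0.25],
}

raw = {name: net_similarity(vec, positives, negatives)
       for name, vec in queries.items()}
normalized = dict(zip(raw, z_scores(list(raw.values()))))

# Assumed decision rule: a positive net score means closer to the positives.
labels = {name: "acceptable" if score > 0 else "defective"
          for name, score in raw.items()}

for name in queries:
    print(name, round(raw[name], 3), round(normalized[name], 3), labels[name])
```

In this sketch no model is retrained: classifying a new sample only requires embedding it and comparing it against the stored references, which is the sense in which such a scheme is zero-shot. In a hybrid setting, image-to-image and image-to-text scores would each be standardized separately before being combined, since their raw ranges generally differ.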
Author(s)
Safdar, Mutahar
McGill University
Wood, Gentry
Zimmermann, Max Gero
Fraunhofer-Institut für Lasertechnik ILT  
Lamouche, Guy
National Research Council Canada
Wanjara, Priti
National Research Council Canada
Zhao, Yaoyao Fiona
McGill University
Open Access
Rights
CC BY-NC-ND 4.0: Creative Commons Attribution-NonCommercial-NoDerivatives
DOI
10.48550/arXiv.2508.20243
10.24406/publica-5560
Language
English
Fraunhofer-Institut für Lasertechnik ILT  
Keyword(s)
  • microstructure information
  • expert knowledge
  • multi-modal features
  • customized representations
  • industrial qualification
  • additive manufacturing
