2025
Journal Article
Title

Multi-fidelity learning for atomistic models via trainable data embeddings

Abstract
We present an approach for end-to-end training of machine learning models for structure-property modeling on collections of datasets derived using different density functional theory functionals and basis sets. This approach overcomes the problem of data inconsistencies in the training of machine learning models on atomistic data. We rephrase the underlying problem as a multi-task learning scenario. We show that conditioning neural network-based models on trainable embedding vectors can effectively account for quantitative differences between methods. This allows for joint training on multiple datasets that would otherwise be incompatible. Therefore, this procedure circumvents the need for re-computations at a unified level of theory. Numerical experiments demonstrate that training on multiple reference methods enables transfer learning between tasks, resulting in even lower errors compared to training on separate tasks alone. Furthermore, we show that this approach can be used for multi-fidelity learning, improving data efficiency for the highest fidelity by an order of magnitude. To test scalability, we train a single model on a joint dataset compiled from ten disjoint subsets of the MultiXC-QM9 dataset generated by different reference methods. Again, we observe transfer learning effects that improve the model errors by a factor of 2 compared to training on each subset alone. We extend our investigation to machine learning force fields for material simulations. To this end, we incorporate trainable embedding vectors into the readout layer of a deep graph neural network (M3GNet) that is simultaneously trained on PBE and r2SCAN labels of the MatPES dataset. We observe that joint training on both fidelity levels reduces the amount of r2SCAN data required to achieve the accuracy of a single-fidelity model by a factor of 10.
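As an illustration of the central idea in the abstract (conditioning a network on one trainable embedding vector per reference method), the following is a minimal sketch and not the authors' implementation. The class name `EmbeddingConditionedReadout`, the layer sizes, the random initialization, and the choice to concatenate the embedding with the input features are all assumptions made for this example; the paper's M3GNet readout is not reproduced here.

```python
import math
import random


class EmbeddingConditionedReadout:
    """Toy readout: h = tanh(W [x ; e_m] + b), y = v . h.

    e_m is a per-method embedding vector. In a real training loop it would
    receive gradients like any other weight (hypothetical setup).
    """

    def __init__(self, n_features, n_methods, emb_dim=2, hidden=4, seed=0):
        rng = random.Random(seed)

        def rand_matrix(rows, cols):
            return [[rng.uniform(-0.5, 0.5) for _ in range(cols)]
                    for _ in range(rows)]

        # One embedding row per reference method (e.g. index 0 and index 1
        # could stand for two different DFT functionals).
        self.embeddings = rand_matrix(n_methods, emb_dim)
        self.W = rand_matrix(hidden, n_features + emb_dim)
        self.b = [0.0] * hidden
        self.v = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]

    def predict(self, x, method):
        # Condition on the method by appending its embedding to the features.
        z = list(x) + self.embeddings[method]
        h = [math.tanh(sum(w * zi for w, zi in zip(row, z)) + bi)
             for row, bi in zip(self.W, self.b)]
        return sum(vi * hi for vi, hi in zip(self.v, h))


# The same structure descriptor yields a method-specific prediction.
model = EmbeddingConditionedReadout(n_features=3, n_methods=2)
x = [0.1, -0.2, 0.3]
print(model.predict(x, method=0))
print(model.predict(x, method=1))
```

All weights except the embedding table are shared across methods, which is what would allow samples labeled by different methods to be mixed in one training set under this kind of design.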
Author(s)
Oerder, Rick Benedikt
Fraunhofer-Institut für Algorithmen und Wissenschaftliches Rechnen SCAI  
Schmieden, Gerrit Wilhelm
Fraunhofer-Institut für Algorithmen und Wissenschaftliches Rechnen SCAI  
Hamaekers, Jan
Fraunhofer-Institut für Algorithmen und Wissenschaftliches Rechnen SCAI  
Journal
Machine learning: science and technology  
Open Access
Rights
CC BY 4.0: Creative Commons Attribution
DOI
10.1088/2632-2153/ae0d41
10.24406/publica-5801
Language
English
Keyword(s)
  • energy prediction
  • interatomic potential
  • multi-fidelity
  • multi-task
  • quantum chemistry
