2025
Book Article
Title

On the Interplay of Subset Selection and Informed Graph Neural Networks

Abstract
Machine learning techniques paired with the availability of massive datasets dramatically enhance our ability to explore the chemical compound space by providing fast and accurate predictions of molecular properties. However, learning on large datasets is strongly limited by the availability of computational resources and can be infeasible in some scenarios. Moreover, the instances in the datasets may not yet be labelled and generating the labels can be costly, as in the case of quantum chemistry computations. Thus, there is a need to select small training subsets from large pools of unlabeled data points and to develop reliable ML methods that can effectively learn from small training sets. This chapter focuses on predicting the molecules’ atomization energy in the QM9 dataset. We investigate the advantages of employing domain knowledge-based data sampling methods for an efficient training set selection combined with informed ML techniques. In particular, we show how maximizing molecular diversity in the training set selection process increases the robustness of linear and nonlinear regression techniques such as kernel methods and graph neural networks. We also check the reliability of the predictions made by the graph neural network with a model-agnostic explainer based on the rate-distortion explanation framework.
Author(s)
Breustedt, Niklas
Climaco, Paolo
Garcke, Jochen  
Fraunhofer-Institut für Algorithmen und Wissenschaftliches Rechnen SCAI  
Hamaekers, Jan
Fraunhofer-Institut für Algorithmen und Wissenschaftliches Rechnen SCAI  
Kutyniok, Gitta
Lorenz, Dirk A.
Oerder, Rick Benedikt
Fraunhofer-Institut für Algorithmen und Wissenschaftliches Rechnen SCAI  
Shukla, Chirag Varun
Mainwork
Informed Machine Learning  
Open Access
DOI
10.1007/978-3-031-83097-6_10
Language
English