Fraunhofer-Gesellschaft
April 10, 2025
Paper (Preprint, Research Paper, Review Paper, White Paper, etc.)
Title

High throughput tight binding calculation of electronic HOMO-LUMO gaps and its prediction for natural compounds

Title Supplement
Published on ChemRxiv, 10 April 2025, Version 2
Abstract
This study investigates the prediction of the HOMO-LUMO (HL) gap of natural compounds, a property central to understanding molecular electronic behavior in cheminformatics and materials science. To address the computational expense of traditional methods, it develops a high-throughput, machine-learning-based approach. Molecular descriptors were calculated and selected with RDKit for 407,000 molecules from the COCONUT database. The computational workflow, managed by Toil and CWL on a Slurm-based high-performance computing cluster, used xTB for electronic structure calculations, with Boltzmann weighting across multiple conformational states. Gradient boosting regression (GBR) and a multi-layer perceptron regressor (MLPR) were compared on their ability to predict HL gaps accurately in this chemical space. In both models, molecular polarizability, particularly as captured by the SMR_VSA descriptors, emerged as crucial for determining the HL gap. Aromatic rings and functional groups such as ketones also significantly influence the HL gap prediction. While the MLPR model showed good overall predictive performance, its accuracy varied across molecular subsets. HL gaps proved difficult to predict for molecules containing aliphatic carboxylic acids, alcohols, and amines in molecular systems with complex electronic structure. This work emphasizes the importance of polarizability and structural features in predictive modeling of the HL gap, showing the potential of machine learning while also highlighting its limitations in handling specific structural motifs. These limitations indicate where further model improvements are most promising.
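The Boltzmann weighting across conformational states mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state the exact weighting scheme, so this assumes the common choice of weighting each conformer's HL gap by exp(-ΔE/kT), where ΔE is its total energy relative to the lowest-energy conformer. The function names `boltzmann_weights` and `weighted_gap`, the default temperature of 298.15 K, and the units (energies in Hartree, gaps in eV) are assumptions made for illustration, not details taken from the paper.

```python
import math

# Boltzmann constant in Hartree per kelvin (rounded CODATA value).
K_B_HARTREE_PER_K = 3.166811563e-6


def boltzmann_weights(energies_hartree, temperature_k=298.15):
    """Return normalized Boltzmann weights for a list of conformer energies.

    Hypothetical helper: assumes total energies in Hartree.
    """
    if not energies_hartree:
        raise ValueError("at least one conformer energy is required")
    kt = K_B_HARTREE_PER_K * temperature_k
    # Shift by the minimum energy so the largest exponent is zero.
    e_min = min(energies_hartree)
    factors = [math.exp(-(e - e_min) / kt) for e in energies_hartree]
    total = sum(factors)
    return [f / total for f in factors]


def weighted_gap(energies_hartree, gaps_ev, temperature_k=298.15):
    """Return the Boltzmann-weighted average HL gap over conformers.

    Hypothetical helper: assumes one gap (in eV) per conformer energy.
    """
    if len(energies_hartree) != len(gaps_ev):
        raise ValueError("energies and gaps must have the same length")
    weights = boltzmann_weights(energies_hartree, temperature_k)
    return sum(w * g for w, g in zip(weights, gaps_ev))
```

Under these assumptions, conformers with equal energies contribute equally, and the lowest-energy conformer dominates the average as the energy spread grows relative to kT.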
Author(s)
Thinius, Sascha
Fraunhofer-Institut für Fertigungstechnik und Angewandte Materialforschung IFAM  
Open Access
DOI
10.26434/chemrxiv-2025-c51jx-v2
Language
English
Keyword(s)
  • workflow

  • CWL

  • tight binding

  • band gap

  • cheminformatics

  • machine learning

  • regression

  • data science

  • interoperability

  • reusability
