June 21, 2024
Journal Article
Title

Llamol: a dynamic multi-conditional generative transformer for de novo molecular design

Abstract
Generative models have demonstrated substantial promise in Natural Language Processing (NLP) and have found application in designing molecules, as seen in Generative Pre-trained Transformer (GPT) models. In our efforts to develop such a tool for exploring the organic chemical space in search of potentially electro-active compounds, we present Llamol, a single novel generative transformer model based on the Llama 2 architecture, which was trained on a 12.5M superset of organic compounds drawn from diverse public sources. To allow maximum flexibility in usage and robustness in view of potentially incomplete data, we introduce Stochastic Context Learning (SCL) as a new training procedure. We demonstrate that the resulting model adeptly handles single- and multi-conditional organic molecule generation with up to four conditions, and more are possible. The model generates valid molecular structures in SMILES notation while flexibly incorporating up to three numerical properties and/or one token sequence into the generative process, as requested. The generated compounds are very satisfactory in all scenarios tested. In detail, we showcase the model's capability to utilize token sequences for conditioning, either individually or in combination with numerical properties, making Llamol a potent tool for de novo molecule design that is easily expandable with new properties.
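The abstract names Stochastic Context Learning but does not describe its mechanics. One plausible reading, offered only as an illustrative sketch and not as the authors' implementation, is that each training example is shown with a randomly chosen subset of its available conditions, so the model learns to generate with any combination of them, including none. The function name `sample_context`, the parameter `keep_prob`, and the condition names `prop_a`, `prop_b`, `prop_c`, and `fragment` are hypothetical placeholders, not taken from the paper.

```python
import random
from typing import Dict, Optional, Union

# A condition is either a numerical property or a token sequence (e.g. SMILES).
Condition = Union[float, str]


def sample_context(
    conditions: Dict[str, Optional[Condition]],
    rng: random.Random,
    keep_prob: float = 0.5,
) -> Dict[str, Condition]:
    """Return a random subset of the conditions that are actually available.

    Assumption: each available condition is kept independently with
    probability `keep_prob`; the paper may use a different sampling scheme.
    """
    context: Dict[str, Condition] = {}
    for name, value in conditions.items():
        if value is None:
            continue  # missing label (incomplete data): never in the context
        if rng.random() < keep_prob:
            context[name] = value
    return context


# Hypothetical record: three numerical slots (one missing) and one token sequence.
rng = random.Random(0)
record = {"prop_a": 1.2, "prop_b": None, "prop_c": 310.4, "fragment": "c1ccccc1"}
for _ in range(3):
    print(sample_context(record, rng))
```

Under this assumption, a record with a missing property still contributes to training, because the missing slot is simply never sampled into the context.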
Author(s)
Dobberstein, Niklas
Fraunhofer-Institut für Algorithmen und Wissenschaftliches Rechnen SCAI  
Maaß, Astrid
Fraunhofer-Institut für Algorithmen und Wissenschaftliches Rechnen SCAI  
Hamaekers, Jan
Fraunhofer-Institut für Algorithmen und Wissenschaftliches Rechnen SCAI  
Journal
Journal of cheminformatics. Online journal  
Open Access
DOI
10.1186/s13321-024-00863-8
Language
English