GLDM: hit molecule generation with constrained graph latent diffusion model.

Conghao Wang, Hiok Hian Ong, Shunsuke Chiba, Jagath C Rajapakse
Author Information
  1. Conghao Wang: School of Computer Science and Engineering, Nanyang Technological University, 50 Nanyang Ave, 639798, Singapore.
  2. Hiok Hian Ong: School of Computer Science and Engineering, Nanyang Technological University, 50 Nanyang Ave, 639798, Singapore.
  3. Shunsuke Chiba: School of Chemistry, Chemical Engineering and Biotechnology, Nanyang Technological University, 21 Nanyang Link, 637371, Singapore.
  4. Jagath C Rajapakse: School of Computer Science and Engineering, Nanyang Technological University, 50 Nanyang Ave, 639798, Singapore.

Abstract

Discovering hit molecules with desired biological activity in a directed manner is a promising but challenging task in computer-aided drug discovery. Inspired by recent generative AI approaches, particularly diffusion models (DMs), we propose the Graph Latent Diffusion Model (GLDM), a latent DM that combines the effectiveness of autoencoders in compressing complex chemical data with the DM's capability to generate novel molecules. Specifically, we first develop an autoencoder to encode molecular data into low-dimensional latent representations and then train the DM in the latent space to generate molecules that induce a targeted biological activity defined by gene expression profiles. Operating the DM in the latent space rather than the input space avoids the complicated operations needed to map molecule decomposition and reconstruction onto diffusion processes, and thus improves training efficiency. Experiments show that GLDM not only achieves outstanding performance on molecular generation benchmarks, but also generates samples with optimal chemical properties and the potential to induce the desired biological activity.
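
The latent-space diffusion idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the linear noise schedule, the number of steps, the latent dimension, and the way the gene expression condition is concatenated to the denoiser input are all assumptions made for illustration, and every function name here is hypothetical. The sketch shows only the standard forward noising of a latent code and the assembly of a conditioned denoiser input; the autoencoder and the learned denoiser are omitted.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Assumed linear noise schedule; the abstract does not specify one."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """Return alpha_bar_t, the running product of (1 - beta_s) for s <= t."""
    out, prod = [], 1.0
    for beta in betas:
        prod *= 1.0 - beta
        out.append(prod)
    return out


def noise_latent(z0, t, alpha_bars, rng):
    """Sample z_t from N(sqrt(alpha_bar_t) * z0, (1 - alpha_bar_t) * I)."""
    a = alpha_bars[t]
    eps = [rng.gauss(0.0, 1.0) for _ in z0]
    zt = [math.sqrt(a) * z + math.sqrt(1.0 - a) * e for z, e in zip(z0, eps)]
    return zt, eps


def denoiser_input(zt, t, num_steps, condition):
    """Concatenate noisy latent, normalised timestep, and condition vector.

    Concatenation is an assumed conditioning mechanism, chosen for brevity.
    """
    return list(zt) + [t / num_steps] + list(condition)


rng = random.Random(0)
num_steps = 1000
alpha_bars = cumulative_alphas(linear_beta_schedule(num_steps))

z0 = [0.5, -1.2, 0.3, 0.9]   # hypothetical latent code from an encoder
gene_expr = [0.1, -0.4]      # hypothetical gene expression condition

zt, eps = noise_latent(z0, 500, alpha_bars, rng)
x = denoiser_input(zt, 500, num_steps, gene_expr)
```

In a training loop of this kind, a denoiser would typically be fit to predict `eps` from `x`. Because the noise is added to a fixed-length latent vector rather than to the molecular graph itself, no graph-specific noising operation is needed, which is the efficiency argument the abstract makes.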

Grants

  1. RG14/23/Ministry of Education

MeSH Term

Diffusion
Drug Discovery
