cheML.io: an online database of ML-generated molecules.

Rustam Zhumagambetov, Daniyar Kazbek, Mansur Shakipov, Daulet Maksut, Vsevolod A Peshkov, Siamac Fazli
Author Information
  1. Rustam Zhumagambetov: Department of Computer Science, School of Engineering and Digital Sciences, Nazarbayev University, Nur-Sultan, Kazakhstan. siamac.fazli@nu.edu.kz
  2. Daniyar Kazbek: Department of Computer Science, School of Engineering and Digital Sciences, Nazarbayev University, Nur-Sultan, Kazakhstan. siamac.fazli@nu.edu.kz
  3. Mansur Shakipov: Department of Computer Science, School of Engineering and Digital Sciences, Nazarbayev University, Nur-Sultan, Kazakhstan. siamac.fazli@nu.edu.kz
  4. Daulet Maksut: Department of Computer Science, School of Engineering and Digital Sciences, Nazarbayev University, Nur-Sultan, Kazakhstan. siamac.fazli@nu.edu.kz
  5. Vsevolod A Peshkov: Department of Chemistry, School of Sciences and Humanities, Nazarbayev University, Nur-Sultan, Kazakhstan. vsevolod.peshkov@nu.edu.kz
  6. Siamac Fazli: Department of Computer Science, School of Engineering and Digital Sciences, Nazarbayev University, Nur-Sultan, Kazakhstan. siamac.fazli@nu.edu.kz

Abstract

Several recent ML algorithms for molecule generation have been used to create an open-access database of virtual molecules. The algorithms were trained on samples from ZINC, a free database of commercially available compounds. The generated molecules, produced by 10 different ML frameworks, were merged with their calculated properties into a database and coupled to a web interface that lets users browse the data conveniently. ML-generated molecules with desired structures and properties can be retrieved with the help of a drawing widget. When a specific search returns insufficient results, users can create new molecules on demand. These newly created molecules are added to the existing database, so both its content and its diversity keep growing in line with users' requirements.
