cheML.io: an online database of ML-generated molecules.
Rustam Zhumagambetov, Daniyar Kazbek, Mansur Shakipov, Daulet Maksut, Vsevolod A Peshkov, Siamac Fazli
Author Information
Rustam Zhumagambetov: Department of Computer Science, School of Engineering and Digital Sciences, Nazarbayev University, Nur-Sultan, Kazakhstan. siamac.fazli@nu.edu.kz
Daniyar Kazbek: Department of Computer Science, School of Engineering and Digital Sciences, Nazarbayev University, Nur-Sultan, Kazakhstan. siamac.fazli@nu.edu.kz
Mansur Shakipov: Department of Computer Science, School of Engineering and Digital Sciences, Nazarbayev University, Nur-Sultan, Kazakhstan. siamac.fazli@nu.edu.kz
Daulet Maksut: Department of Computer Science, School of Engineering and Digital Sciences, Nazarbayev University, Nur-Sultan, Kazakhstan. siamac.fazli@nu.edu.kz
Vsevolod A Peshkov: Department of Chemistry, School of Sciences and Humanities, Nazarbayev University, Nur-Sultan, Kazakhstan. vsevolod.peshkov@nu.edu.kz
Siamac Fazli: Department of Computer Science, School of Engineering and Digital Sciences, Nazarbayev University, Nur-Sultan, Kazakhstan. siamac.fazli@nu.edu.kz
Several recent ML algorithms for molecule generation were used to create an open-access database of virtual molecules. The algorithms were trained on samples from ZINC, a free database of commercially available compounds. Molecules generated by 10 different ML frameworks, together with their calculated properties, were merged into a database and coupled to a web interface that lets users browse the data conveniently. ML-generated molecules with desired structures and properties can be retrieved with the help of a drawing widget. When a specific search returns insufficient results, users can generate new molecules on demand. These newly generated molecules are added to the existing database, so its content and diversity keep growing in line with users' requirements.
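The search-then-generate workflow described in the abstract can be illustrated with a minimal Python sketch. Everything in it is hypothetical and is not cheML.io's actual implementation: the `MoleculeStore` class, the `search_or_generate` function, the `min_hits` threshold, and the `toy_generator` stand-in for an ML model are all assumptions made for illustration. In particular, a plain text match on SMILES strings stands in for a real substructure search.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set


@dataclass
class MoleculeStore:
    """Toy in-memory stand-in for the molecule database (hypothetical schema)."""

    # Maps a SMILES string to the names of the generators that produced it,
    # so the same molecule coming from several frameworks is stored once.
    records: Dict[str, Set[str]] = field(default_factory=dict)

    def add(self, smiles: str, source: str) -> None:
        self.records.setdefault(smiles, set()).add(source)

    def search(self, fragment: str) -> List[str]:
        # Naive text match on SMILES; a real service would run a
        # cheminformatics substructure search instead.
        return sorted(s for s in self.records if fragment in s)


def search_or_generate(
    store: MoleculeStore,
    fragment: str,
    generate: Callable[[str, int], List[str]],
    min_hits: int = 3,
) -> List[str]:
    """Search the store; if too few hits, generate more and store them."""
    hits = store.search(fragment)
    if len(hits) < min_hits:
        for smiles in generate(fragment, min_hits - len(hits)):
            store.add(smiles, source="on-demand")
        hits = store.search(fragment)
    return hits


def toy_generator(fragment: str, n: int) -> List[str]:
    # Placeholder for an ML generator: prepends carbon chains to the fragment.
    return ["C" * (i + 1) + fragment for i in range(n)]


store = MoleculeStore()
store.add("c1ccccc1O", "model_a")  # phenol, contains the benzene ring
store.add("CCO", "model_b")        # ethanol, does not match the query
hits = search_or_generate(store, "c1ccccc1", toy_generator, min_hits=3)
print(hits)
```

Because generated molecules are written back to the store, a repeated query with the same fragment is answered from the database without calling the generator again, which mirrors how the database is described as growing with user requests.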