Hierarchical Molecular Graph Self-Supervised Learning for Property Prediction.

Xuan Zang, Xianbing Zhao, Buzhou Tang
Author Information
  1. Xuan Zang: Department of Computer Science, Harbin Institute of Technology, 518055, Shenzhen, China.
  2. Xianbing Zhao: Department of Computer Science, Harbin Institute of Technology, 518055, Shenzhen, China.
  3. Buzhou Tang: Department of Computer Science, Harbin Institute of Technology, 518055, Shenzhen, China. tangbuzhou@gmail.com.

Abstract

Molecular graph representation learning has shown considerable strength in molecular analysis and drug discovery. Because molecular property labels are difficult to obtain, pre-training based on self-supervised learning has become increasingly popular in molecular representation learning. Notably, most existing works employ Graph Neural Networks (GNNs) as backbones to encode implicit representations of molecules. However, vanilla GNN encoders ignore the chemical structural information and functions implied by molecular motifs, and obtaining the graph-level representation via a READOUT function hinders the interaction between graph and node representations. In this paper, we propose Hierarchical Molecular Graph Self-supervised Learning (HiMol), a pre-training framework that learns molecule representations for property prediction. First, we present a Hierarchical Molecular Graph Neural Network (HMGNN), which encodes motif structure and extracts node-motif-graph hierarchical molecular representations. Then, we introduce Multi-level Self-supervised Pre-training (MSP), in which corresponding multi-level generative and predictive tasks are designed as self-supervised signals for the HiMol model. Finally, superior molecular property prediction results on both classification and regression tasks demonstrate the effectiveness of HiMol. Moreover, visualization results on downstream datasets show that the molecule representations learned by HiMol capture chemical semantic information and properties.
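
To make the node-motif-graph hierarchy described above concrete, the following is a minimal, hypothetical sketch (not the authors' released implementation). It assumes one simple linear message-passing step per level and a precomputed 0/1 node-to-motif assignment matrix; all class and variable names are illustrative.

```python
import torch
import torch.nn as nn


class HierarchicalMolEncoder(nn.Module):
    """Sketch of a node-motif-graph hierarchical encoder.

    Nodes are embedded with one message-passing step, each motif
    representation is the mean of its assigned node embeddings, and the
    graph representation is the mean of the motif embeddings.
    """

    def __init__(self, in_dim: int, hid_dim: int):
        super().__init__()
        self.node_mlp = nn.Linear(in_dim, hid_dim)
        self.motif_mlp = nn.Linear(hid_dim, hid_dim)
        self.graph_mlp = nn.Linear(hid_dim, hid_dim)

    def forward(self, x, adj, motif_assign):
        # x:            [num_nodes, in_dim]      atom features
        # adj:          [num_nodes, num_nodes]   adjacency with self-loops
        # motif_assign: [num_motifs, num_nodes]  0/1 node-to-motif assignment
        h_node = torch.relu(self.node_mlp(adj @ x))                 # node level
        size = motif_assign.sum(dim=1, keepdim=True).clamp(min=1)
        h_motif = torch.relu(self.motif_mlp(motif_assign @ h_node / size))  # motif level
        h_graph = torch.relu(self.graph_mlp(h_motif.mean(dim=0)))           # graph level
        return h_node, h_motif, h_graph


if __name__ == "__main__":
    # toy molecule: 5 atoms, 2 motifs
    x = torch.randn(5, 16)
    adj = torch.eye(5)
    motif_assign = torch.tensor([[1., 1., 1., 0., 0.],
                                 [0., 0., 0., 1., 1.]])
    enc = HierarchicalMolEncoder(in_dim=16, hid_dim=32)
    h_node, h_motif, h_graph = enc(x, adj, motif_assign)
    print(h_node.shape, h_motif.shape, h_graph.shape)
```

In a multi-level self-supervised setup in the spirit of MSP, each of the three returned representations could feed its own generative or predictive head (e.g., node attribute reconstruction at the node level and property-like pretext prediction at the graph level); those heads are not shown here.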

Grants

  1. 62276082/National Natural Science Foundation of China (National Science Foundation of China)
  2. 61876052/National Natural Science Foundation of China (National Science Foundation of China)
