GDCL-NcDA: identifying non-coding RNA-disease associations via contrastive learning between deep graph learning and deep matrix factorization.

Ning Ai, Yong Liang, Haoliang Yuan, Dong Ouyang, Shengli Xie, Xiaoying Liu
Author Information
  1. Ning Ai: Peng Cheng Laboratory, Shenzhen, 518005, Guangdong, China.
  2. Yong Liang: Peng Cheng Laboratory, Shenzhen, 518005, Guangdong, China. yongliangresearch@gmail.com.
  3. Haoliang Yuan: School of Automation, Guangdong University of Technology, Guangzhou, 510006, Guangdong, China.
  4. Dong Ouyang: Peng Cheng Laboratory, Shenzhen, 518005, Guangdong, China.
  5. Shengli Xie: Institute of Intelligent Information Processing, Guangdong University of Technology, Guangzhou, 510000, Guangdong, China.
  6. Xiaoying Liu: Computer Engineering Technical College, Guangdong Polytechnic of Science and Technology, Zhuhai, Guangdong, 519090, China.

Abstract

Non-coding RNAs (ncRNAs) have drawn wide attention in recent years because they play vital roles in life activities. As a complement to wet-lab experiments, computational prediction methods can greatly reduce experimental costs. However, a high rate of false negatives in the data and insufficient use of multi-source information can degrade the performance of these methods. Furthermore, many computational methods lack robustness and generalize poorly across datasets. In this work, we propose GDCL-NcDA, an end-to-end computational framework that combines deep graph learning and deep matrix factorization (DMF) through contrastive learning (CL) to identify latent ncRNA-disease associations on diverse multi-source heterogeneous networks (MHNs). The MHNs comprise different similarity networks and proven associations among ncRNAs (miRNAs, circRNAs, and lncRNAs), genes, and diseases. First, GDCL-NcDA employs a deep graph convolutional network and multiple attention mechanisms to adaptively integrate the multi-source information in the MHNs and reconstruct the ncRNA-disease association graph. Then, GDCL-NcDA applies DMF to the reconstructed graphs to predict latent disease-associated ncRNAs, which reduces the impact of false negatives in the original associations. Finally, GDCL-NcDA computes a contrastive loss between the reconstructed graphs and the predicted graphs to improve the generalization and robustness of the framework. The experimental results show that GDCL-NcDA outperforms highly related computational methods. Moreover, case studies demonstrate the effectiveness of GDCL-NcDA in identifying associations between diverse types of ncRNAs and diseases.
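The three stages named in the abstract can be illustrated with a toy forward pass. This is a minimal sketch under stated assumptions, not the authors' implementation: the data are random, the weights are untrained, the attention is a simple softmax over views, a rank-2 truncated SVD stands in for the deep matrix factorization, and the contrastive term is an InfoNCE-style loss. All function names, sizes, and parameters below are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n_rna, n_dis, dim, rank = 6, 4, 8, 2  # toy sizes (assumed)

def sym(n):
    """Random symmetric similarity matrix with entries in [0, 1) (toy data)."""
    s = rng.random((n, n))
    return (s + s.T) / 2

def normalize_adj(a):
    """Symmetric normalization D^-1/2 (A + I) D^-1/2, as in a standard GCN."""
    a_hat = a + np.eye(a.shape[0])
    d = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d[:, None] * d[None, :]

def gcn_layer(a_norm, h, w):
    """One graph-convolution step: ReLU(A_norm @ H @ W)."""
    return np.maximum(a_norm @ h @ w, 0.0)

def attention_fuse(views, q):
    """Softmax-weighted sum of per-view embeddings (simple view-level attention)."""
    scores = np.array([np.mean(v @ q) for v in views])
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    fused = sum(wt * v for wt, v in zip(weights, views))
    return fused, weights

def low_rank(m, k):
    """Rank-k truncated SVD: a shallow stand-in for deep matrix factorization."""
    u, s, vt = np.linalg.svd(m, full_matrices=False)
    return (u[:, :k] * s[:k]) @ vt[:k]

def info_nce(z1, z2, tau=0.5):
    """InfoNCE-style loss: row i of z1 should be most similar to row i of z2."""
    z1 = z1 / np.maximum(np.linalg.norm(z1, axis=1, keepdims=True), 1e-12)
    z2 = z2 / np.maximum(np.linalg.norm(z2, axis=1, keepdims=True), 1e-12)
    logits = (z1 @ z2.T) / tau
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))

# Known (toy) ncRNA-disease associations and untrained parameters.
assoc = (rng.random((n_rna, n_dis)) < 0.3).astype(float)
x = rng.normal(size=(n_rna + n_dis, dim))
w = rng.normal(size=(dim, dim)) * 0.1
q = rng.normal(size=dim)

# Stage 1: one GCN pass per similarity view, then attention-based fusion.
views = []
for _ in range(2):
    adj = np.block([[sym(n_rna), assoc], [assoc.T, sym(n_dis)]])
    views.append(gcn_layer(normalize_adj(adj), x, w))
fused, weights = attention_fuse(views, q)
scores = fused[:n_rna] @ fused[n_rna:].T
reconstructed = 1.0 / (1.0 + np.exp(-scores))  # reconstructed association graph

# Stage 2: factorize the reconstructed graph to get predicted associations.
predicted = low_rank(reconstructed, rank)

# Stage 3: contrastive loss between reconstructed and predicted graphs.
loss = info_nce(reconstructed, predicted)
print("view weights:", weights, "contrastive loss:", float(loss))
```

In a trained model the contrastive term would be combined with a supervised reconstruction loss and minimized by gradient descent; the sketch only shows how the three outputs relate to one another.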

MeSH Terms

Learning
RNA, Untranslated
RNA, Long Noncoding
MicroRNAs
RNA, Circular
Computational Biology

Chemicals

RNA, Untranslated
RNA, Long Noncoding
MicroRNAs
RNA, Circular
