Identifying Candidate Gene-Disease Associations via Graph Neural Networks.

Pietro Cinaglia, Mario Cannataro
Author Information
  1. Pietro Cinaglia: Department of Health Sciences, Magna Graecia University of Catanzaro, 88100 Catanzaro, Italy.
  2. Mario Cannataro: Data Analytics Research Center, Department of Medical and Surgical Sciences, Magna Graecia University of Catanzaro, 88100 Catanzaro, Italy.

Abstract

Real-world objects are often defined by their relationships or connections. A graph (or network) naturally expresses this model through nodes and edges. In biology, networks can be classified into several types according to what their nodes and edges represent, and gene-disease associations (GDAs) are one of them. In this paper, we present a solution based on a graph neural network (GNN) for identifying candidate GDAs. We trained our model on an initial set of well-known, curated inter- and intra-relationships between genes and diseases. The model is based on graph convolutions, using multiple convolutional layers with a point-wise non-linearity function following each layer. Embeddings were computed for the input network, built on a set of GDAs, to map each node into a vector of real numbers in a multidimensional space. Results showed an AUC of 95% for training, validation, and testing, which in a real-case scenario translated into a positive response for 93% of the Top-15 candidate GDAs (those with the highest dot product) identified by our solution. The experiments were conducted on the DisGeNET dataset, while the DiseaseGene Association Miner (DG-AssocMiner) dataset from Stanford's BioSNAP was also processed, for performance evaluation only.
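The pipeline summarized in the abstract (stacked graph convolutions, a point-wise non-linearity after each layer, node embeddings, and dot-product ranking of candidate pairs) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy graph, the layer sizes, the one-hot input features, the symmetric normalization, and the choice of ReLU are all assumptions made for illustration, and the weights are random and untrained, so the resulting ranking carries no biological meaning.

```python
import random

def normalize_adjacency(n, edges):
    """Return D^-1/2 (A + I) D^-1/2 as a dense matrix (assumed normalization)."""
    a = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]  # self-loops
    for u, v in edges:
        a[u][v] = a[v][u] = 1.0  # undirected edge
    deg = [sum(row) for row in a]
    return [[a[i][j] / (deg[i] * deg[j]) ** 0.5 for j in range(n)] for i in range(n)]

def matmul(x, y):
    """Dense matrix product of two list-of-lists matrices."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*y)] for row in x]

def relu(x):
    """Point-wise non-linearity (ReLU is an assumption; the abstract names none)."""
    return [[max(0.0, v) for v in row] for row in x]

def gcn_embed(a_hat, features, weights):
    """Apply one graph convolution per weight matrix, with ReLU after each layer."""
    h = features
    for w in weights:
        h = relu(matmul(matmul(a_hat, h), w))
    return h

def score(z, u, v):
    """Dot product of two node embeddings; higher means a stronger candidate."""
    return sum(p * q for p, q in zip(z[u], z[v]))

# Hypothetical toy network: nodes 0-2 are genes, nodes 3-4 are diseases.
edges = [(0, 3), (1, 3), (1, 4), (0, 1), (1, 2)]
n_nodes = 5
a_hat = normalize_adjacency(n_nodes, edges)

# One-hot node features and random, untrained weights (illustration only).
features = [[1.0 if i == j else 0.0 for j in range(n_nodes)] for i in range(n_nodes)]
rng = random.Random(0)
def random_matrix(rows, cols):
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]
weights = [random_matrix(n_nodes, 8), random_matrix(8, 4)]

z = gcn_embed(a_hat, features, weights)  # one 4-dimensional vector per node

# Rank gene-disease pairs that are not yet edges by their dot product.
candidates = [(0, 4), (2, 3), (2, 4)]
ranked = sorted(((pair, score(z, *pair)) for pair in candidates),
                key=lambda item: item[1], reverse=True)
for pair, s in ranked:
    print(pair, round(s, 4))
```

In a trained model the weights would be learned from known associations, and the top-ranked pairs would be the candidate GDAs to validate against external evidence.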
