Representing core gene expression activity relationships using the latent structure implicit in Bayesian networks.

Jiahao Gao, Mark Gerstein
Author Information
  1. Jiahao Gao: Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT 06520, United States.
  2. Mark Gerstein: Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT 06520, United States.

Abstract

MOTIVATION: Many types of networks, such as co-expression or ChIP-seq-based gene-regulatory networks, provide useful information for biomedical studies. However, they are often too densely connected to interpret, forming "indecipherable hairballs."
RESULTS: To address this issue, we propose that a Bayesian network can summarize the core relationships between gene expression activities. This network, which we call the LatentDAG, is substantially simpler (by two orders of magnitude) than conventional co-expression and ChIP-seq networks. It provides clearer clusters, without extraneous cross-cluster connections, and clear separators between modules. Moreover, a number of clear examples show how it connects steps in the transcriptional regulatory network with other networks (e.g. RNA-binding protein networks). In conjunction with a graph neural network, the LatentDAG performs better than other biological networks in a variety of tasks, including predicting gene conservation and clustering genes.
AVAILABILITY AND IMPLEMENTATION: Code is available at https://github.com/gersteinlab/LatentDAG.
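
The contrast drawn in the abstract between dense co-expression networks and a sparser network of core relationships can be illustrated with a toy example. The sketch below is not the LatentDAG method and uses no code from the repository above. It assumes a hypothetical linear chain of 20 simulated genes, and it uses thresholded partial correlations as a simple stand-in for the conditional-independence structure that a Bayesian network encodes. All function names, parameters, and the 0.2 threshold are illustrative assumptions.

```python
import numpy as np


def simulate_chain(n_genes=20, n_samples=5000, weight=0.9, seed=0):
    """Simulate expression for a hypothetical chain g0 -> g1 -> ... -> g19.

    Each gene depends only on its predecessor; noise is scaled so that
    every gene has unit variance.
    """
    rng = np.random.default_rng(seed)
    x = np.empty((n_samples, n_genes))
    x[:, 0] = rng.standard_normal(n_samples)
    noise_scale = np.sqrt(1.0 - weight ** 2)
    for j in range(1, n_genes):
        x[:, j] = weight * x[:, j - 1] + noise_scale * rng.standard_normal(n_samples)
    return x


def partial_correlations(x):
    """Partial correlation of each gene pair given all other genes."""
    precision = np.linalg.inv(np.cov(x, rowvar=False))
    d = np.sqrt(np.diag(precision))
    pcor = -precision / np.outer(d, d)
    np.fill_diagonal(pcor, 1.0)
    return pcor


def count_edges(score_matrix, threshold):
    """Count gene pairs whose absolute score exceeds the threshold."""
    upper = np.triu(np.abs(score_matrix), k=1)  # each pair once, no diagonal
    return int((upper > threshold).sum())


expression = simulate_chain()

# Co-expression network: marginal correlations, which propagate along the chain.
coexpression_edges = count_edges(np.corrcoef(expression, rowvar=False), 0.2)

# Conditional structure: only directly linked genes stay associated.
conditional_edges = count_edges(partial_correlations(expression), 0.2)

print("co-expression edges:", coexpression_edges)
print("conditional edges:  ", conditional_edges)
```

In this toy setting the co-expression network links many gene pairs that are only indirectly related, while the conditional structure keeps just the direct links of the chain. Real expression data are noisier and higher-dimensional, so this shows only the general reason a Bayesian network can be much sparser, not the size of the reduction reported in the paper.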

Grants

  1. U24 MH136793/NIMH NIH HHS
  2. 1U24MH136793/NIH HHS

MeSH Terms

Bayes Theorem
Gene Regulatory Networks
Humans
Algorithms
Computational Biology
Gene Expression Profiling
Cluster Analysis
