Probabilistic Graphical Models Applied to Biological Networks.
Natalia Faraj Murad, Marcelo Mendes Brandão
Author Information
Natalia Faraj Murad: Center for Molecular Biology and Genetic Engineering, State University of Campinas, Campinas, São Paulo, Brazil.
Marcelo Mendes Brandão: Center for Molecular Biology and Genetic Engineering, State University of Campinas, Campinas, São Paulo, Brazil. brandaom@unicamp.br.
A biological network can be defined as a set of molecules together with all the interactions among them. Studying these networks can help to predict gene function and phenotypes and to identify regulatory molecular patterns. Probabilistic graphical models (PGMs) are widely used to integrate different data sources with modeled biological networks. Inferring these models from large-scale molecular biology experiments allows us to predict how experimental treatments influence the behavior or phenotype of organisms. Here, we introduce the main types of PGMs and their applications in the context of biological networks.