k-Strong Inference Algorithm: A Hybrid Information Theory Based Gene Network Inference Algorithm.

Mustafa Özgür Cingiz
Author Information
  1. Mustafa Özgür Cingiz: Computer Engineering Department, Faculty of Engineering and Natural Sciences, Bursa Technical University, Mimar Sinan Campus, Yildirim, 16310, Bursa, Turkey. mustafa.cingiz@btu.edu.tr.

Abstract

Gene networks help researchers understand the mechanisms linking genes and diseases while reducing the need for wet lab experiments. Numerous gene network inference (GNI) algorithms have been presented in the literature to infer accurate gene networks. We propose a hybrid GNI algorithm, the k-Strong Inference Algorithm (ksia), to infer more reliable and robust gene networks from omics datasets. To increase reliability, ksia integrates Pearson correlation coefficient (PCC) and Spearman rank correlation coefficient (SCC) scores when determining mutual information scores between molecules, which increases the diversity of the predicted relations. To infer a more robust gene network, ksia applies three different elimination steps that remove redundant and spurious relations between genes. The performance of ksia was evaluated on datasets from the Many Microbe Microarrays database in an overlap analysis against other GNI algorithms, namely ARACNE, C3NET, CLR, and MRNET. Ksia inferred fewer relations because of its strict elimination steps. However, ksia generally performed better on the Escherichia coli (E. coli) and Saccharomyces cerevisiae (yeast) gene expression datasets in terms of F-measure and precision. The integration of association estimator scores and the three elimination stages slightly increases the performance of ksia-based gene networks. The ksia R package and its user manual are available at https://github.com/ozgurcingiz/ksia .
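The two quantitative ideas in the abstract, deriving a mutual information score from PCC and SCC and scoring an inferred network by its overlap with a reference network, can be illustrated with a minimal Python sketch. This is not the ksia implementation (ksia is an R package). The sketch assumes the standard Gaussian relation MI = -0.5 * ln(1 - r^2) between a correlation r and mutual information, and it assumes that the two estimates are combined by taking the larger one; the abstract does not state ksia's actual integration rule. All function names are hypothetical, and the three elimination steps are not sketched because the abstract does not describe them.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient (PCC) of two equal-length samples."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation coefficient (SCC): PCC of the ranks."""
    return pearson(ranks(x), ranks(y))


def gaussian_mi(r):
    """Mutual information (nats) of a bivariate Gaussian with correlation r."""
    r2 = min(r * r, 1 - 1e-12)  # clamp so the log stays finite when |r| = 1
    return -0.5 * math.log(1.0 - r2)


def combined_mi(x, y):
    # ASSUMPTION: keep the larger of the PCC-based and SCC-based estimates.
    # The abstract does not specify how ksia integrates the two scores.
    return max(gaussian_mi(pearson(x, y)), gaussian_mi(spearman(x, y)))


def precision_f_measure(predicted, reference):
    """Precision and F-measure of undirected predicted edges vs. a reference."""
    pred = {frozenset(edge) for edge in predicted}
    ref = {frozenset(edge) for edge in reference}
    tp = len(pred & ref)  # predicted edges that also occur in the reference
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(ref) if ref else 0.0
    f = 2 * precision * recall / (precision + recall) if tp else 0.0
    return precision, f


if __name__ == "__main__":
    # A monotone but nonlinear pair: SCC captures it fully, PCC only partly.
    x = [1, 2, 3, 4, 5]
    y = [1, 4, 9, 16, 25]
    print(pearson(x, y), spearman(x, y), combined_mi(x, y))
    print(precision_f_measure([("a", "b"), ("b", "c")],
                              [("b", "a"), ("c", "d"), ("d", "e")]))
```

The toy pair shows why using both estimators can broaden the set of detected relations: a rank-based score rewards any monotone dependence, while PCC rewards only linear dependence.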

MeSH Term

Algorithms
Gene Regulatory Networks
Saccharomyces cerevisiae
Escherichia coli
Information Theory
Computational Biology
Gene Expression Profiling
Databases, Genetic
Software
