Semi-Supervised Multi-View Learning for Gene Network Reconstruction.

Michelangelo Ceci, Gianvito Pio, Vladimir Kuzmanovski, Sašo Džeroski
Author Information
  1. Michelangelo Ceci: Dept. of Computer Science, University of Bari Aldo Moro, Via Orabona 4, 70125 Bari, Italy.
  2. Gianvito Pio: Dept. of Computer Science, University of Bari Aldo Moro, Via Orabona 4, 70125 Bari, Italy.
  3. Vladimir Kuzmanovski: Dept. of Knowledge Technologies, Jožef Stefan Institute, Jamova 39, 1000 Ljubljana, Slovenia.
  4. Sašo Džeroski: Dept. of Knowledge Technologies, Jožef Stefan Institute, Jamova 39, 1000 Ljubljana, Slovenia.

Abstract

The task of gene regulatory network reconstruction from high-throughput data has received increasing attention in recent years. As a consequence, many inference methods for solving this task have been proposed in the literature. It has been recently observed, however, that no single inference method performs optimally across all datasets. It has also been shown that integrating predictions from multiple inference methods is more robust and achieves high performance across diverse datasets. Inspired by this research, in this paper we propose a machine learning solution that learns to combine predictions from multiple inference methods. While this approach adds complexity to the inference process, we expect it to also carry substantial benefits. These would come from automatic adaptation to patterns in the outputs of individual inference methods, so that regulatory interactions can be identified more reliably when these patterns occur. This article demonstrates the benefits (in terms of accuracy of the reconstructed networks) of the proposed method, which exploits an iterative, semi-supervised, ensemble-based algorithm. The algorithm learns to combine the interactions predicted by many different inference methods in the multi-view learning setting. The empirical evaluation of the proposed algorithm on a prokaryotic model organism (E. coli) and on a eukaryotic model organism (S. cerevisiae) clearly shows improved performance over state-of-the-art methods. The results also indicate that gene regulatory network reconstruction from real datasets is more difficult for S. cerevisiae than for E. coli. The software, all the datasets used in the experiments, and all the results are available for download at the following link: http://figshare.com/articles/Semi_supervised_Multi_View_Learning_for_Gene_Network_Reconstruction/1604827.
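
The idea of learning to combine the outputs of several inference methods from a few known interactions can be illustrated with a small sketch. The code below is not the algorithm proposed in the paper: it replaces the ensemble-based learner with a much simpler self-training weighted average, and every name and number in it (SCORES, KNOWN, learn_weights, combine, self_train, the gene labels and score values) is a hypothetical made up for illustration. Each candidate gene pair carries one confidence score per inference method (one "view" per method), only a few pairs are labeled as known interactions, and each round promotes the highest-scoring unlabeled pairs to pseudo-labels before the method weights are re-estimated.

```python
from statistics import mean

# Hypothetical toy data: one confidence score per inference method for each gene pair.
SCORES = {
    ("g1", "g2"): [0.9, 0.8, 0.2],
    ("g1", "g3"): [0.8, 0.7, 0.9],
    ("g2", "g3"): [0.2, 0.1, 0.8],
    ("g2", "g4"): [0.7, 0.9, 0.1],
    ("g3", "g4"): [0.1, 0.2, 0.7],
}
# Hypothetical set of already known (labeled) interactions.
KNOWN = {("g1", "g2")}


def learn_weights(scores, positives):
    """Weight each method by how much higher it scores labeled pairs than the rest."""
    n_methods = len(next(iter(scores.values())))
    weights = []
    for m in range(n_methods):
        pos = [s[m] for pair, s in scores.items() if pair in positives]
        rest = [s[m] for pair, s in scores.items() if pair not in positives]
        # A method that does not separate the two groups gets weight zero.
        gap = mean(pos) - mean(rest) if pos and rest else 0.0
        weights.append(max(gap, 0.0))
    total = sum(weights)
    if total == 0:
        # Fall back to a plain average when no method is informative.
        return [1.0 / n_methods] * n_methods
    return [w / total for w in weights]


def combine(scores, weights):
    """Weighted average of the per-method scores for every candidate pair."""
    return {pair: sum(w * x for w, x in zip(weights, s)) for pair, s in scores.items()}


def self_train(scores, known, rounds=2, add_per_round=1):
    """Iteratively pseudo-label the top-ranked unlabeled pairs and refit the weights."""
    positives = set(known)
    for _ in range(rounds):
        combined = combine(scores, learn_weights(scores, positives))
        unlabeled = sorted(
            (pair for pair in scores if pair not in positives),
            key=lambda pair: combined[pair],
            reverse=True,
        )
        positives.update(unlabeled[:add_per_round])
    weights = learn_weights(scores, positives)
    return weights, combine(scores, weights)


if __name__ == "__main__":
    weights, combined = self_train(SCORES, KNOWN)
    for pair, score in sorted(combined.items(), key=lambda kv: kv[1], reverse=True):
        print(pair, round(score, 3))
```

In this toy setting the third method disagrees with the labeled interactions and is down-weighted, which is the kind of adaptation to patterns in the individual outputs that the abstract refers to; a learned ensemble, as used in the paper, could capture far richer patterns than a single weight per method.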

MeSH Terms

Escherichia coli
Gene Regulatory Networks
Genes, Bacterial
Genes, Fungal
Machine Learning
Saccharomyces cerevisiae
Software
