TreeFix: statistically informed gene tree error correction using species trees.

Yi-Chieh Wu, Matthew D Rasmussen, Mukul S Bansal, Manolis Kellis
Author Information
  1. Yi-Chieh Wu: Department of Electrical Engineering and Computer Science, Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA 02139, USA.

Abstract

Accurate gene tree reconstruction is a fundamental problem in phylogenetics, with many important applications. However, sequence data alone often lack enough information to confidently support one gene tree topology over many competing alternatives. Here, we present a novel framework for combining sequence data and species tree information, and we describe an implementation of this framework in TreeFix, a new phylogenetic program for improving gene tree reconstructions. Given a gene tree (preferably computed using a maximum-likelihood phylogenetic program), TreeFix finds a "statistically equivalent" gene tree that minimizes a species tree-based cost function. We have applied TreeFix to two clades, 12 Drosophila and 16 fungal genomes, as well as to simulated phylogenies, and show that it dramatically improves reconstructions compared with current state-of-the-art programs. Given its accuracy, speed, and simplicity, TreeFix should be applicable to a wide range of analyses and have many important implications for future investigations of gene evolution. The source code and a sample data set are available at http://compbio.mit.edu/treefix.
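
The selection rule described in the abstract (keep only gene trees that are "statistically equivalent" to the input, then prefer the one with the lowest species tree-based cost) can be illustrated with a small Python sketch. This is not the TreeFix implementation: the function names, the nested-tuple tree encoding, the use of a duplication count under LCA reconciliation as the cost, and the `is_equivalent` predicate (a stand-in for a statistical test on the sequence data) are all assumptions made for illustration.

```python
def leaf_set(tree):
    """Species labels at the leaves of a nested-tuple binary tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return leaf_set(tree[0]) | leaf_set(tree[1])


def clades(tree):
    """All clades of a species tree, each as a frozenset of leaf names."""
    if isinstance(tree, str):
        return [frozenset([tree])]
    return clades(tree[0]) + clades(tree[1]) + [leaf_set(tree)]


def lca(species, species_clades):
    """Smallest species-tree clade containing the given species set."""
    return min((c for c in species_clades if species <= c), key=len)


def count_duplications(gene_tree, species_clades):
    """Assumed cost: number of duplication nodes under LCA reconciliation.

    A gene-tree node counts as a duplication when it maps to the same
    species-tree clade as at least one of its children.
    """
    if isinstance(gene_tree, str):
        return 0
    left, right = gene_tree
    here = lca(leaf_set(gene_tree), species_clades)
    is_dup = here in (lca(leaf_set(left), species_clades),
                      lca(leaf_set(right), species_clades))
    return (int(is_dup)
            + count_duplications(left, species_clades)
            + count_duplications(right, species_clades))


def best_equivalent_tree(start, candidates, cost, is_equivalent):
    """Lowest-cost candidate passing the equivalence check.

    The input tree is the fallback, so the result never costs more
    than `start`.
    """
    best, best_cost = start, cost(start)
    for tree in candidates:
        if not is_equivalent(tree):
            continue  # hypothetical test rejected this topology
        c = cost(tree)
        if c < best_cost:
            best, best_cost = tree, c
    return best


# Toy data (hypothetical): gene-tree leaves are labelled by species.
species = clades((("A", "B"), "C"))
ml_tree = (("A", "C"), "B")
alternatives = [(("A", "B"), "C"), (("B", "C"), "A")]


def cost(tree):
    return count_duplications(tree, species)


# Here every alternative is treated as equivalent to the input tree.
fixed = best_equivalent_tree(ml_tree, alternatives, cost, lambda t: True)
```

In this toy case the input tree implies one duplication, while the alternative matching the species tree implies none, so that alternative is selected. A real search would generate candidates by tree rearrangements and decide equivalence from the sequence alignment, neither of which is modelled here.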

MeSH Term

Animals
Classification
Drosophila
Fungi
Phylogeny
Reproducibility of Results
Software
