Efficient tree searches with available algorithms.

Gonzalo Giribet
Author Information
  1. Gonzalo Giribet: Department of Organismic and Evolutionary Biology and Museum of Comparative Zoology, Harvard University, 26 Oxford Street, Cambridge, MA 02138, U.S.A. ggiribet@oeb.harvard.edu

Abstract

Phylogenetic methods based on optimality criteria are highly desirable for their logical properties, but they are time-consuming compared with other methods of tree construction. Traditionally, researchers have been limited to exploring tree space with multiple replicates of Wagner addition followed by typical hill-climbing algorithms such as SPR and/or TBR branch swapping, but these methods have been shown to be insufficient for "large" data sets (or even for small data sets with a complex tree space). Here, I review different algorithms and search strategies used for phylogenetic analysis, with the aim of clarifying certain aspects of this important part of the phylogenetic inference exercise. The techniques discussed here apply to both major families of methods based on optimality criteria (parsimony and maximum likelihood) and allow the thorough analysis of complex data sets with hundreds to thousands of terminal taxa. A new technique, called pre-processed searches, is proposed for reusing phylogenetic results obtained in previous analyses, to increase the applicability of the previously proposed jumpstarting phylogenetics method. This article is intended to serve as an educational and algorithmic reference for biologists interested in phylogenetic analysis.
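
The traditional strategy mentioned in the abstract (replicates of Wagner addition, each followed by SPR hill climbing) can be illustrated with a minimal sketch under parsimony. This is not the article's implementation or that of any particular program. It assumes rooted binary trees stored as nested 2-tuples of taxon names, unordered characters scored with Fitch optimization, a simplified Wagner step that fully rescores every insertion point, and a rooted form of SPR that only prunes clades. All names (`fitch_score`, `insertions`, `prunings`, `wagner`, `spr_hill_climb`, `search`, `TOY`) are hypothetical.

```python
import random

def fitch_score(tree, data):
    """Parsimony length of a rooted binary tree (nested 2-tuples of taxon names)."""
    total = 0
    def state_sets(node):
        nonlocal total
        if isinstance(node, str):              # leaf: one singleton set per character
            return [{c} for c in data[node]]
        left, right = (state_sets(child) for child in node)
        out = []
        for a, b in zip(left, right):
            common = a & b
            if common:
                out.append(common)
            else:                              # empty intersection costs one step
                out.append(a | b)
                total += 1
        return out
    state_sets(tree)
    return total

def insertions(tree, sub):
    """Yield every tree obtained by attaching `sub` to an edge of `tree`."""
    yield (tree, sub)                          # edge above this node
    if not isinstance(tree, str):
        left, right = tree
        for new_left in insertions(left, sub):
            yield (new_left, right)
        for new_right in insertions(right, sub):
            yield (left, new_right)

def prunings(tree):
    """Yield (subtree, remainder) for every clade that can be cut off."""
    if isinstance(tree, str):
        return
    left, right = tree
    yield left, right
    yield right, left
    for sub, rest in prunings(left):
        yield sub, (rest, right)
    for sub, rest in prunings(right):
        yield sub, (left, rest)

def wagner(data, rng):
    """Random addition sequence: add each taxon at its cheapest position."""
    taxa = sorted(data)
    rng.shuffle(taxa)
    tree = (taxa[0], taxa[1])
    for taxon in taxa[2:]:
        tree = min(insertions(tree, taxon), key=lambda t: fitch_score(t, data))
    return tree

def spr_hill_climb(tree, data):
    """Accept the first SPR rearrangement that shortens the tree; stop when none does."""
    best, best_score = tree, fitch_score(tree, data)
    improved = True
    while improved:
        improved = False
        for sub, rest in prunings(best):
            for cand in insertions(rest, sub):
                score = fitch_score(cand, data)
                if score < best_score:
                    best, best_score, improved = cand, score, True
                    break
            if improved:
                break
    return best, best_score

def search(data, replicates=10, seed=0):
    """Run several Wagner + SPR replicates and keep the shortest tree found."""
    rng = random.Random(seed)
    results = [spr_hill_climb(wagner(data, rng), data) for _ in range(replicates)]
    return min(results, key=lambda r: r[1])

# Hypothetical toy matrix: five taxa, five binary characters.
TOY = {"A": "11000", "B": "11000", "C": "00001", "D": "00110", "E": "00110"}
```

Calling `search(TOY)` returns a `(tree, length)` pair. Every character in the toy matrix varies, so each needs at least one step and the length cannot fall below 5. Each replicate rescoring every candidate from scratch is what makes this approach expensive, and the cost grows quickly with the number of taxa, which is the limitation the abstract attributes to the traditional strategy on large data sets.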
