Phylogenetic inference via sequential Monte Carlo.

Alexandre Bouchard-Côté, Sriram Sankararaman, Michael I Jordan
Author Information
  1. Alexandre Bouchard-Côté: Department of Statistics, University of British Columbia, Vancouver, BC V6T 1Z2, Canada.

Abstract

Bayesian inference provides an appealing general framework for phylogenetic analysis, able to incorporate a wide variety of modeling assumptions and to provide a coherent treatment of uncertainty. Existing computational approaches to Bayesian inference based on Markov chain Monte Carlo (MCMC) have not, however, kept pace with the scale of the data analysis problems in phylogenetics, and this has hindered the adoption of Bayesian methods. In this paper, we present an alternative to MCMC based on sequential Monte Carlo (SMC). We develop an extension of classical SMC based on partially ordered sets and show how to apply this framework, which we refer to as PosetSMC, to phylogenetic analysis. We provide a theoretical treatment of PosetSMC and also present an experimental evaluation of PosetSMC on both synthetic and real data. The empirical results demonstrate that PosetSMC is a very promising alternative to MCMC, converging up to two orders of magnitude faster. We discuss other factors favorable to the adoption of PosetSMC in phylogenetics, including its ability to estimate marginal likelihoods, its ready implementability on parallel and distributed computing platforms, and the possibility of combining it with MCMC in hybrid MCMC-SMC schemes. Software for PosetSMC is available at http://www.stat.ubc.ca/ bouchard/PosetSMC.

Grants

  1. R01 GM071749/NIGMS NIH HHS

MeSH Terms

Algorithms
Bayes Theorem
Gene Frequency
Humans
Markov Chains
Models, Genetic
Monte Carlo Method
Phylogeny
RNA, Ribosomal, 16S
Software

Chemicals

RNA, Ribosomal, 16S
