New Routes to Phylogeography: A Bayesian Structured Coalescent Approximation.

Nicola De Maio, Chieh-Hsi Wu, Kathleen M O'Reilly, Daniel Wilson
Author Information
  1. Nicola De Maio: Institute for Emerging Infections, Oxford Martin School, Oxford, United Kingdom; Nuffield Department of Medicine, University of Oxford, Oxford, United Kingdom.
  2. Chieh-Hsi Wu: Nuffield Department of Medicine, University of Oxford, Oxford, United Kingdom.
  3. Kathleen M O'Reilly: MRC Centre for Outbreak Analysis and Modelling, School of Public Health, Faculty of Medicine, Imperial College London, London, United Kingdom.
  4. Daniel Wilson: Institute for Emerging Infections, Oxford Martin School, Oxford, United Kingdom; Nuffield Department of Medicine, University of Oxford, Oxford, United Kingdom; Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford, United Kingdom.

Abstract

Phylogeographic methods aim to infer migration trends and the history of sampled lineages from genetic data. Applications of phylogeography are broad, and in the context of pathogens include the reconstruction of transmission histories and the origin and emergence of outbreaks. Phylogeographic inference based on bottom-up population genetics models is computationally expensive, and as a result faster alternatives based on the evolution of discrete traits have become popular. In this paper, we show that inference of migration rates and root locations based on discrete trait models is extremely unreliable and sensitive to biased sampling. To address this problem, we introduce BASTA (BAyesian STructured coalescent Approximation), a new approach implemented in BEAST2 that combines the accuracy of methods based on the structured coalescent with the computational efficiency required to handle more than just a few populations. We illustrate the potentially severe implications of poor model choice for phylogeographic analyses by investigating the zoonotic transmission of Ebola virus. Whereas the structured coalescent analysis correctly infers that successive human Ebola outbreaks have been seeded by a large unsampled non-human reservoir population, the discrete trait analysis implausibly concludes that undetected human-to-human transmission has allowed the virus to persist over the past four decades. As genomics takes on an increasingly prominent role informing the control and prevention of infectious diseases, it will be vital that phylogeographic inference provides robust insights into transmission history.

Grants

  1. /Wellcome Trust
  2. 101237/Wellcome Trust
  3. MR/J014362/1/Medical Research Council

MeSH Terms

Algorithms
Animal Migration
Animals
Bayes Theorem
Birds
Disease Outbreaks
Ebolavirus
Evolution, Molecular
Genetic Variation
Hemorrhagic Fever, Ebola
Humans
Influenza A virus
Influenza in Birds
Models, Genetic
Phylogeny
Phylogeography
Plant Diseases
Plant Viruses
Zoonoses
