Smooth skyride through a rough skyline: Bayesian coalescent-based inference of population dynamics.

Vladimir N Minin, Erik W Bloomquist, Marc A Suchard
Author Information
  1. Vladimir N Minin: Department of Statistics, University of Washington, USA. vminin@u.washington.edu

Abstract

Kingman's coalescent process opens the door for estimation of population genetics model parameters from molecular sequences. One paramount parameter of interest is the effective population size. Temporal variation of this quantity characterizes the demographic history of a population. Because researchers are rarely able to choose a priori a deterministic model describing effective population size dynamics for the data at hand, nonparametric curve-fitting methods based on multiple change-point (MCP) models have been developed. We propose an alternative to change-point modeling that exploits Gaussian Markov random fields to achieve temporal smoothing of the effective population size in a Bayesian framework. The main advantage of our approach is that, in contrast to MCP models, the explicit temporal smoothing does not require strong prior decisions. To approximate the posterior distribution of the population dynamics, we use efficient, fast-mixing Markov chain Monte Carlo algorithms designed for highly structured Gaussian models. In a simulation study, we demonstrate that the proposed temporal smoothing method, named Bayesian skyride, successfully recovers "true" population size trajectories in all simulation scenarios and competes well with the MCP approaches without invoking strong prior assumptions. We apply our Bayesian skyride method to two real data sets. First, we analyze sequences of hepatitis C virus contemporaneously sampled in Egypt, reproducing all key known aspects of the viral population dynamics. Next, we estimate the demographic histories of human influenza A hemagglutinin sequences, serially sampled throughout three flu seasons.
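
The smoothing idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes contemporaneously sampled sequences, a piecewise-constant effective population size with one value per intercoalescent interval, and a first-order Gaussian Markov random field with uniform weights on the log population sizes. The function names and the fixed `precision` argument are hypothetical; placing a prior on the precision and sampling by Markov chain Monte Carlo are omitted.

```python
import math


def coalescent_log_likelihood(waiting_times, log_sizes):
    """Log-likelihood of intercoalescent waiting times.

    Assumption: waiting_times[i] is the time spent with n - i lineages,
    where n = len(waiting_times) + 1, and log_sizes[i] is the log effective
    population size held constant over that interval.
    """
    n = len(waiting_times) + 1
    total = 0.0
    for i, (w, g) in enumerate(zip(waiting_times, log_sizes)):
        k = n - i                    # lineages present during interval i
        pairs = k * (k - 1) / 2.0    # number of lineage pairs that can coalesce
        # Exponential waiting time with rate pairs / exp(g)
        total += math.log(pairs) - g - pairs * w * math.exp(-g)
    return total


def gmrf_log_prior(log_sizes, precision):
    """Unnormalized first-order GMRF log-prior (uniform weights, an assumption).

    Penalizes squared differences between neighboring log population sizes,
    so smooth trajectories receive higher prior density.
    """
    m = len(log_sizes)
    roughness = sum((b - a) ** 2 for a, b in zip(log_sizes, log_sizes[1:]))
    return 0.5 * (m - 1) * math.log(precision) - 0.5 * precision * roughness


def log_posterior(waiting_times, log_sizes, precision):
    """Unnormalized log-posterior of the trajectory for a fixed precision."""
    return (coalescent_log_likelihood(waiting_times, log_sizes)
            + gmrf_log_prior(log_sizes, precision))


if __name__ == "__main__":
    times = [0.1, 0.2, 0.5]          # toy waiting times for n = 4 sequences
    flat = [0.0, 0.0, 0.0]
    jagged = [0.0, 1.0, 0.0]
    print(log_posterior(times, flat, precision=1.0))
    print(log_posterior(times, jagged, precision=1.0))
```

In this sketch a larger `precision` pulls neighboring interval estimates together, while a value near zero leaves each interval to be estimated almost independently.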

Grants

  1. T32 AI007370/NIAID NIH HHS
  2. AI07370/NIAID NIH HHS

MeSH Terms

Bayes Theorem
Egypt
Genetics, Population
Hepacivirus
Humans
Markov Chains
Mathematics
Models, Genetic
Models, Statistical
Orthomyxoviridae
Phylogeny
Population Density
Population Dynamics
