Estimating the distribution of selection coefficients from phylogenetic data using sitewise mutation-selection models.

Asif U Tamuri, Mario dos Reis, Richard A Goldstein
Author Information
  1. Asif U Tamuri: Medical Research Council National Institute for Medical Research, London, NW7 1AA, United Kingdom.

Abstract

Estimation of the distribution of selection coefficients of mutations is a long-standing issue in molecular evolution. In addition to population-based methods, the distribution can be estimated from DNA sequence data by phylogenetic-based models. Previous models have generally found unimodal distributions where the probability mass is concentrated between mildly deleterious and nearly neutral mutations. Here we use a sitewise mutation-selection phylogenetic model to estimate the distribution of selection coefficients among novel and fixed mutations (substitutions) in a data set of 244 mammalian mitochondrial genomes and a set of 401 PB2 proteins from influenza. We find a bimodal distribution of selection coefficients for novel mutations, both in the mitochondrial data set and for the influenza protein evolving in its natural reservoir, birds. Most of the mutations are strongly deleterious, with the rest of the probability mass concentrated around mildly deleterious to neutral mutations. The distribution of the coefficients among substitutions is unimodal and symmetrical around nearly neutral substitutions for both data sets at adaptive equilibrium. About 0.5% of the nonsynonymous mutations and 14% of the nonsynonymous substitutions in the mitochondrial proteins are advantageous; the corresponding figures for the influenza protein are 0.5% and 24%. Following a host shift of influenza from birds to humans, however, we find among novel mutations in PB2 a trimodal distribution with a small mode of advantageous mutations.
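
The distinction the abstract draws between coefficients among novel mutations and among substitutions can be illustrated with a small sketch. It is not the authors' implementation. It assumes the widely used mutation-selection form in which a mutation with scaled selection coefficient S fixes at a rate S / (1 - exp(-S)) relative to a neutral one, a single site with equal mutation rates between all states, and equilibrium state frequencies proportional to exp(F). The fitness values and all function names are hypothetical.

```python
import math


def relative_fixation_rate(s):
    """Fixation rate of a mutation with scaled coefficient s, relative to neutral."""
    if abs(s) < 1e-8:
        return 1.0 + s / 2.0        # limit of s / (1 - exp(-s)) near s = 0
    if s < -700.0:
        return 0.0                  # avoids overflow; effectively never fixes
    return s / -math.expm1(-s)      # s / (1 - exp(-s))


def normalise(pairs):
    """Rescale the weights of (coefficient, weight) pairs so they sum to 1."""
    total = sum(w for _, w in pairs)
    return [(s, w / total) for s, w in pairs]


def coefficient_distributions(fitness):
    """Return weighted selection coefficients among mutations and substitutions.

    `fitness` holds hypothetical scaled fitnesses F for the states at one site.
    Assumes equal mutation rates, so equilibrium frequencies are
    proportional to exp(F).
    """
    top = max(fitness)
    raw = [math.exp(f - top) for f in fitness]
    freq = [r / sum(raw) for r in raw]

    mutations, substitutions = [], []
    for i, f_i in enumerate(fitness):
        for j, f_j in enumerate(fitness):
            if i == j:
                continue
            s = f_j - f_i  # scaled selection coefficient of the change i -> j
            # Novel mutations arise in proportion to the frequency of state i.
            mutations.append((s, freq[i]))
            # Substitutions are mutations weighted by how readily they fix.
            substitutions.append((s, freq[i] * relative_fixation_rate(s)))
    return normalise(mutations), normalise(substitutions)


def fraction_advantageous(pairs):
    """Total weight carried by coefficients greater than zero."""
    return sum(w for s, w in pairs if s > 0)


# Hypothetical site: one favoured state, two mildly worse, one strongly deleterious.
mutations, substitutions = coefficient_distributions([2.0, 0.0, -1.0, -10.0])
print(fraction_advantageous(mutations), fraction_advantageous(substitutions))
```

Under these assumptions most novel mutations at the toy site are deleterious, because the site spends most of its time in the favoured state. The substitutions are balanced between positive and negative coefficients, which follows from detailed balance at equilibrium and is in line with the symmetrical distribution the abstract reports.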

Grants

  1. /Wellcome Trust
  2. MC_U117573805/Medical Research Council
  3. /Biotechnology and Biological Sciences Research Council

MeSH Terms

Algorithms
Animals
Computer Simulation
Evolution, Molecular
Genetic Drift
Humans
Models, Genetic
Mutation
Phylogeny
Reproducibility of Results
Selection, Genetic
