Sequence entropy of folding and the absolute rate of amino acid substitutions.

Richard A Goldstein, David D Pollock
Author Information
  1. Richard A Goldstein: Division of Infection and Immunity, University College London, London, WC1E 6BT, UK.
  2. David D Pollock: Department of Biochemistry and Molecular Genetics, University of Colorado School of Medicine, Aurora, CO, 80045, USA. David.Pollock@ucdenver.edu.

Abstract

Adequate representations of protein evolution should consider how the acceptance of mutations depends on the sequence context in which they arise. However, epistatic interactions among sites in a protein result in heterogeneities in the substitution rate, both temporal and spatial, that are beyond the capabilities of current models. Here we use parallels between amino acid substitutions and chemical reaction kinetics to develop an improved theory of protein evolution. We constructed a mechanistic framework for modelling amino acid substitution rates that uses the formalisms of statistical mechanics, with principles of population genetics underlying the analysis. Theoretical analyses and computer simulations of proteins under purifying selection for thermodynamic stability show that substitution rates and the stabilization of resident amino acids (the 'evolutionary Stokes shift') can be predicted from biophysics and the effect of sequence entropy alone. Furthermore, we demonstrate that substitutions predominantly occur when epistatic interactions result in near neutrality; substitution rates are determined by how often epistasis results in such nearly neutral conditions. This theory provides a general framework for modelling protein sequence change under purifying selection, potentially explains patterns of convergence and mutation rates in real proteins that are incompatible with previous models, and provides a better null model for the detection of adaptive changes.
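
The link the abstract draws between thermodynamic stability, purifying selection, and near neutrality can be illustrated with a minimal sketch. This is not the authors' framework. It assumes a common simplification in which fitness equals the fraction of protein that is folded in a two-state model, and it uses Kimura's diffusion approximation for the fixation probability in a haploid population. The function names, the thermal energy value `KT`, and the parameters `dg_resident`, `ddg`, and `n_eff` are all hypothetical choices made for illustration.

```python
import math

KT = 0.6  # kcal/mol; approximate thermal energy near room temperature (assumed value)


def fraction_folded(delta_g, kt=KT):
    """Two-state folding: fraction folded for folding free energy delta_g (negative = stable)."""
    return 1.0 / (1.0 + math.exp(delta_g / kt))


def fixation_probability(s, n_eff):
    """Kimura's diffusion approximation for one new mutant in a haploid population."""
    if abs(s) < 1e-12:
        return 1.0 / n_eff  # neutral limit
    exponent = -2.0 * n_eff * s
    if exponent > 700.0:
        return 0.0  # strongly deleterious: effectively never fixes (also avoids overflow)
    # expm1 keeps precision when s is tiny, which is the nearly neutral regime
    return math.expm1(-2.0 * s) / math.expm1(exponent)


def relative_substitution_rate(dg_resident, ddg, n_eff):
    """Substitution rate relative to the neutral rate for a mutation that shifts stability by ddg."""
    w_res = fraction_folded(dg_resident)
    w_mut = fraction_folded(dg_resident + ddg)
    s = (w_mut - w_res) / w_res  # selection coefficient, assuming fitness = fraction folded
    return n_eff * fixation_probability(s, n_eff)


if __name__ == "__main__":
    for ddg in (-1.0, 0.0, 0.1, 1.0, 3.0):
        rate = relative_substitution_rate(dg_resident=-5.0, ddg=ddg, n_eff=1000)
        print(f"ddG = {ddg:+.1f} kcal/mol -> relative rate = {rate:.3g}")
```

Under these assumptions, mutations with a small stability effect are accepted at close to the neutral rate, while strongly destabilizing ones are almost never fixed. The sketch treats the stability effect `ddg` as a fixed input; in the theory described above it would depend on the sequence context through epistasis.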

Grants

  1. MC_U117573805/Medical Research Council
  2. MC_PC_13056/Medical Research Council
  3. BB/P007562/1/Biotechnology and Biological Sciences Research Council
  4. R01 GM097251/NIGMS NIH HHS
  5. R01 GM083127/NIGMS NIH HHS

MeSH Term

Amino Acid Substitution
Computational Biology
Entropy
Evolution, Molecular
Models, Chemical
Protein Folding
Proteins

Chemicals

Proteins
