Computing Ka and Ks with a consideration of unequal transitional substitutions.

Zhang Zhang, Jun Li, Jun Yu
Author Information
  1. Zhang Zhang: Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100080, China. zhangzhang@genomics.org.cn

Abstract

BACKGROUND: Approximate methods for estimating nonsynonymous and synonymous substitution rates (Ka and Ks) among protein-coding sequences have adopted different mutation (substitution) models. Several methods have been proposed over the past two decades, but none has considered unequal transitional substitutions (between the two purines, A and G, or between the two pyrimidines, T and C), which become apparent when the sequence data to be compared are vast and significantly diverged.
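To make the distinction between the two kinds of transition concrete, the following minimal sketch tallies purine transitions (A<->G), pyrimidine transitions (C<->T) and transversions between two aligned sequences. The function name and the example sequences are hypothetical, and the code only counts observed differences; it applies no correction for multiple substitutions and is not the method described in this abstract.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_differences(seq1, seq2):
    """Count observed purine transitions, pyrimidine transitions and
    transversions between two aligned nucleotide sequences.

    Hypothetical helper for illustration: raw counts only, with no
    correction for multiple hits at the same site.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    counts = {
        "purine_transitions": 0,      # A <-> G
        "pyrimidine_transitions": 0,  # C <-> T
        "transversions": 0,           # purine <-> pyrimidine
    }
    for a, b in zip(seq1.upper(), seq2.upper()):
        # Skip identical sites, gaps and ambiguous bases.
        if a == b or a not in "ACGT" or b not in "ACGT":
            continue
        if a in PURINES and b in PURINES:
            counts["purine_transitions"] += 1
        elif a in PYRIMIDINES and b in PYRIMIDINES:
            counts["pyrimidine_transitions"] += 1
        else:
            counts["transversions"] += 1
    return counts


if __name__ == "__main__":
    # Made-up toy alignment, not data from the study.
    print(classify_differences("AAGCTT", "GAACCA"))
```

If the two transition counts differ markedly over a large alignment, a model that treats them as a single class may be a poor fit, which is the situation the abstract refers to.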
RESULTS: We propose a new method (MYN), a modified version of the Yang-Nielsen algorithm (YN), for evolutionary analysis of protein-coding sequences in general. MYN adopts the Tamura-Nei model, which considers differences among the rates of transitional and transversional substitutions and also factors in codon frequency bias. We evaluate the performance of MYN by comparing it with other methods, especially YN, and show that MYN has minimal deviations when parameters vary within normal ranges defined by empirical data.
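As a rough illustration of what the Tamura-Nei model adds, the sketch below builds an unnormalized nucleotide-level rate matrix with separate rate ratios for purine and pyrimidine transitions. The function name, the parameter names kappa_r and kappa_y, and the example values are assumptions made for illustration. How MYN applies the model at the codon level is not described in this abstract, so this should not be read as the authors' implementation.

```python
BASES = "TCAG"


def tn93_rate_matrix(freqs, kappa_r, kappa_y):
    """Build an unnormalized Tamura-Nei style rate matrix.

    freqs   -- dict of equilibrium base frequencies, keys T, C, A, G
    kappa_r -- purine transition (A <-> G) rate relative to transversions
    kappa_y -- pyrimidine transition (C <-> T) rate relative to transversions

    Off-diagonal entry q[i][j] is the relative rate of change i -> j times
    the frequency of the target base j; each diagonal makes its row sum to 0.
    """
    q = {}
    for i in BASES:
        row = {}
        for j in BASES:
            if i == j:
                continue
            if {i, j} == {"A", "G"}:
                rate = kappa_r
            elif {i, j} == {"C", "T"}:
                rate = kappa_y
            else:
                rate = 1.0  # transversion
            row[j] = rate * freqs[j]
        row[i] = -sum(row.values())
        q[i] = row
    return q


if __name__ == "__main__":
    # Illustrative values only, not estimates from the study.
    uniform = {b: 0.25 for b in BASES}
    q = tn93_rate_matrix(uniform, kappa_r=4.0, kappa_y=2.0)
    for i in BASES:
        print(i, [round(q[i][j], 3) for j in BASES])
```

Setting kappa_r equal to kappa_y collapses the two transition classes into one, which gives an HKY85-style matrix with a single transition/transversion rate ratio.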
CONCLUSION: Our comparative results, derived from consistency analysis, computer simulations and authentic datasets, indicate that ignoring unequal transitional rates may lead to serious biases and that MYN performs well in most of the tested cases. These results also suggest that obtaining reliable synonymous and nonsynonymous substitution rates depends primarily on less biased estimates of the transition/transversion rate ratio.

MeSH Terms

Algorithms
Amino Acid Substitution
Biological Evolution
Computer Simulation
Genetics, Medical
Humans
Models, Genetic
Oryza
Point Mutation
Sequence Alignment
Sequence Analysis, DNA