Codon Deviation Coefficient: a novel measure for estimating codon usage bias and its statistical significance.

Zhang Zhang, Jun Li, Peng Cui, Feng Ding, Ang Li, Jeffrey P Townsend, Jun Yu
Author Information
  1. Zhang Zhang: Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia.

Abstract

BACKGROUND: Genetic mutation, selective pressure for translational efficiency and accuracy, level of gene expression, and protein function through natural selection are all believed to lead to codon usage bias (CUB). Therefore, informative measurement of CUB is of fundamental importance to making inferences regarding gene function and genome evolution. However, extant measures of CUB have not fully accounted for the quantitative effect of background nucleotide composition and have not statistically evaluated the significance of CUB in sequence analysis.
RESULTS: Here we propose a novel measure, the Codon Deviation Coefficient (CDC), that provides an informative measurement of CUB and its statistical significance without requiring any prior knowledge. Unlike previous measures, CDC estimates CUB by accounting for background nucleotide compositions tailored to codon positions and uses bootstrapping to assess the statistical significance of CUB for any given sequence. We evaluate CDC on simulated sequences and empirical data and show that it outperforms extant measures by achieving a more informative estimation of CUB and its statistical significance.
CONCLUSIONS: As validated by both simulated and empirical data, CDC provides a highly informative quantification of CUB and its statistical significance, useful for determining comparative magnitudes and patterns of biased codon usage for genes or genomes with diverse sequence compositions.
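
The two ideas named in the abstract, an expectation built from position-specific background nucleotide composition and a bootstrap test of significance, can be illustrated with a minimal sketch. This is not the authors' CDC formula, which the abstract does not give. The sketch assumes that expected codon usage is the product of the nucleotide frequencies at the three codon positions (renormalized over sense codons), that deviation is one minus the cosine similarity between observed and expected usage, and that the null distribution comes from resampling codons from the expected usage. All function names, the standard genetic code, and the default of 200 replicates are hypothetical choices made for illustration.

```python
import math
import random
from collections import Counter
from itertools import product

BASES = "ACGT"
STOPS = {"TAA", "TAG", "TGA"}  # assumption: standard genetic code
SENSE_CODONS = ["".join(c) for c in product(BASES, repeat=3)
                if "".join(c) not in STOPS]
SENSE_SET = set(SENSE_CODONS)


def codons_of(seq):
    """Split a coding sequence into in-frame sense codons (stops and ambiguous codons dropped)."""
    seq = seq.upper()
    triplets = [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]
    return [c for c in triplets if c in SENSE_SET]


def positional_composition(codons):
    """Nucleotide frequencies at codon positions 1, 2 and 3."""
    if not codons:
        raise ValueError("no sense codons found")
    comp = []
    for pos in range(3):
        counts = Counter(c[pos] for c in codons)
        comp.append({b: counts[b] / len(codons) for b in BASES})
    return comp


def expected_usage(comp):
    """Expected codon usage under positional background composition (illustrative assumption)."""
    raw = [comp[0][c[0]] * comp[1][c[1]] * comp[2][c[2]] for c in SENSE_CODONS]
    total = sum(raw)
    return [r / total for r in raw]


def deviation(codons):
    """1 - cosine similarity between observed and expected usage; 0 means no deviation."""
    counts = Counter(codons)
    obs = [counts[c] / len(codons) for c in SENSE_CODONS]
    exp = expected_usage(positional_composition(codons))
    dot = sum(o * e for o, e in zip(obs, exp))
    norm = math.sqrt(sum(o * o for o in obs)) * math.sqrt(sum(e * e for e in exp))
    return 1.0 - dot / norm


def bootstrap_p_value(seq, replicates=200, seed=0):
    """Return (deviation, p-value), resampling codons from the expected usage as the null."""
    rng = random.Random(seed)
    codons = codons_of(seq)
    observed = deviation(codons)
    exp = expected_usage(positional_composition(codons))
    hits = 0
    for _ in range(replicates):
        sample = rng.choices(SENSE_CODONS, weights=exp, k=len(codons))
        if deviation(sample) >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return observed, (hits + 1) / (replicates + 1)
```

Under these assumptions, a sequence that uses only AAA and CCC in equal proportion has uniform A/C composition at every position, so eight codons are expected equally while only two are observed; the sketch reports a deviation of 0.5 and a small bootstrap p-value. A sequence consisting of one repeated codon is fully explained by its own positional composition and yields a deviation of 0.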

MeSH Terms

Amino Acids
Animals
Arabidopsis
Base Composition
Biological Evolution
Codon
Computer Simulation
Escherichia coli
Genome
Mutation
Saccharomyces cerevisiae
Selection, Genetic
Statistics as Topic

Chemicals

Amino Acids
Codon
