A generalized hidden Markov model for determining sequence-based predictors of nucleosome positioning.

Carlee Moser, Mayetri Gupta
Author Information
  1. Carlee Moser: Boston University.

Abstract

Chromatin structure, defined by the positioning of nucleosomes and nucleosome-free regions along the DNA, has been found to have a substantial impact on a range of cell functions and processes, from transcriptional regulation to growth and development. Although numerous experimental and computational approaches have been developed in the past few years to determine the intrinsic relationship between chromatin structure (nucleosome positioning) and DNA sequence features, there is as yet no universally accurate approach for predicting nucleosome positioning from the underlying DNA sequence alone. Here we propose an alternative approach to predicting nucleosome positioning from sequence that makes use of characteristic sequence differences and inherent dependencies in overlapping sequence features. Our nucleosome positioning prediction algorithm, based on hierarchical generalized hidden Markov models (HGHMMs), was used to predict nucleosomal state from the DNA sequence of yeast chromosome III and was compared with two existing methods. The HGHMM method performed favorably among the three methods in terms of specificity and sensitivity, and its estimates were largely consistent with predictions from the method of Yuan and Liu (2008). However, all three methods still give higher-than-desirable misclassification rates, indicating that sequence-based features may provide only limited information about the positioning of nucleosomes. The method is implemented in the open-source statistical software R and is freely available from the authors' website.
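
To make the general idea of predicting a hidden nucleosomal state from sequence more concrete, the following is a minimal sketch of Viterbi decoding for an ordinary two-state hidden Markov model, together with the per-base sensitivity and specificity used to compare methods. It is not the authors' HGHMM, which additionally models state durations and hierarchical sequence features, and it is written in Python rather than R. The state labels, the transition and emission probabilities, and the function names are all assumptions chosen for illustration and are not estimated from any data.

```python
import math

STATES = ("N", "L")  # N = nucleosomal, L = linker (nucleosome-free); labels are assumptions

# Illustrative parameters only; not estimated from any data set.
START = {"N": 0.5, "L": 0.5}
TRANS = {"N": {"N": 0.95, "L": 0.05}, "L": {"N": 0.10, "L": 0.90}}
EMIT = {
    "N": {"A": 0.20, "C": 0.30, "G": 0.30, "T": 0.20},
    "L": {"A": 0.35, "C": 0.15, "G": 0.15, "T": 0.35},
}


def viterbi(seq):
    """Return the most probable state path for seq under the toy HMM."""
    if not seq:
        return []
    # score[s] = log-probability of the best path ending in state s
    score = {s: math.log(START[s]) + math.log(EMIT[s][seq[0]]) for s in STATES}
    back = []  # back[t][s] = best predecessor of state s at position t + 1
    for base in seq[1:]:
        new_score, pointers = {}, {}
        for s in STATES:
            prev = max(STATES, key=lambda p: score[p] + math.log(TRANS[p][s]))
            pointers[s] = prev
            new_score[s] = (
                score[prev] + math.log(TRANS[prev][s]) + math.log(EMIT[s][base])
            )
        score = new_score
        back.append(pointers)
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


def sensitivity_specificity(truth, predicted, positive="N"):
    """Per-base sensitivity and specificity, with `positive` as the positive class."""
    pairs = list(zip(truth, predicted))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    sens = tp / (tp + fn) if tp + fn else float("nan")
    spec = tn / (tn + fp) if tn + fp else float("nan")
    return sens, spec
```

Given a reference annotation for the same sequence, `sensitivity_specificity(reference, viterbi(seq))` returns the two per-base rates. A generalized HMM would replace the geometric state durations implied by the self-transition probabilities in `TRANS` with an explicit length distribution per state.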

References

  1. Annu Rev Biophys Biomol Struct. 2007;36:329-47 [PMID: 17311525]
  2. Proc Natl Acad Sci U S A. 2005 Apr 12;102(15):5501-6 [PMID: 15795371]
  3. Genome Res. 2007 Jun;17(6):947-53 [PMID: 17568010]
  4. Curr Opin Genet Dev. 2006 Apr;16(2):171-6 [PMID: 16503136]
  5. Nat Genet. 2007 Oct;39(10):1235-44 [PMID: 17873876]
  6. Proc Natl Acad Sci U S A. 2005 May 17;102(20):7079-84 [PMID: 15883375]
  7. Science. 2005 Jul 22;309(5734):626-30 [PMID: 15961632]
  8. Genomics Proteomics Bioinformatics. 2011 Apr;9(1-2):1-6 [PMID: 21641556]
  9. Nucleic Acids Res. 2005 Dec 09;33(21):6743-55 [PMID: 16339114]
  10. Trends Genet. 2005 Mar;21(3):138-42 [PMID: 15734572]
  11. Nature. 2006 Aug 17;442(7104):772-8 [PMID: 16862119]
  12. Genes Dev. 2005 May 15;19(10):1188-98 [PMID: 15905407]
  13. PLoS Comput Biol. 2008 Jan;4(1):e13 [PMID: 18225943]
  14. Nat Genet. 2006 Oct;38(10):1104-5 [PMID: 17006463]
  15. Bioessays. 1994 Mar;16(3):165-70 [PMID: 8166669]
  16. Biometrics. 2007 Sep;63(3):797-805 [PMID: 17825011]
  17. J Mol Biol. 2004 May 7;338(4):695-709 [PMID: 15099738]
  18. Genome Res. 2011 Nov;21(11):1863-71 [PMID: 21750105]
  19. Chromosome Res. 2006;14(1):5-16 [PMID: 16506092]
  20. PLoS Genet. 2006 Sep 22;2(9):e158 [PMID: 17002501]
  21. Genome Res. 2007 Jun;17(6):877-85 [PMID: 17179217]
  22. Nat Genet. 2006 Oct;38(10):1210-5 [PMID: 16964265]

Grants

  1. R03 HG004946/NHGRI NIH HHS
  2. HG004946/NHGRI NIH HHS

MeSH Terms

Algorithms
Chromatin
DNA
Genome, Fungal
Markov Chains
Models, Statistical
Nucleosomes
ROC Curve
Reproducibility of Results
Saccharomyces cerevisiae

Chemicals

Chromatin
Nucleosomes
DNA

