A continuous-index Bayesian hidden Markov model for prediction of nucleosome positioning in genomic DNA.

Ritendranath Mitra, Mayetri Gupta
Author Information
  1. Ritendranath Mitra: Department of Biostatistics, The University of Texas M. D. Anderson Cancer Center, Houston, TX 77230, USA.

Abstract

Nucleosomes are units of chromatin structure, consisting of DNA sequence wrapped around proteins called "histones." Nucleosomes occur at variable intervals throughout genomic DNA and prevent transcription factor (TF) binding by blocking TF access to the DNA. A map of nucleosomal locations would enable researchers to detect TF binding sites with greater efficiency. Our objective is to construct an accurate genomic map of nucleosome-free regions (NFRs) based on data from high-throughput genomic tiling arrays in yeast. These high-volume data typically have a complex structure in the form of dependence on neighboring probes as well as underlying DNA sequence, variable-sized gaps, and missing data. We propose a novel continuous-index model appropriate for non-equispaced tiling array data that simultaneously incorporates DNA sequence features relevant to nucleosome formation. Simulation studies and an application to a yeast nucleosomal assay demonstrate the advantages of using the new modeling framework, as well as its robustness to distributional misspecifications. Our results reinforce the previous biological hypothesis that higher-order nucleotide combinations are important in distinguishing nucleosomal regions from NFRs.
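The abstract does not spell out the model, so the following is only a minimal sketch of the general idea behind a continuous-index hidden Markov model for non-equispaced probes, not the authors' method. It assumes a two-state continuous-time Markov chain (nucleosome vs. NFR) whose transition probabilities depend on the genomic gap between adjacent probes, Gaussian emissions, and a forward filter that skips the emission term for missing probes. All function names, rates, means, and standard deviations are hypothetical, and the DNA sequence features described in the abstract are omitted.

```python
import math

def transition_matrix(gap, rate_to_nfr, rate_to_nuc):
    """2x2 transition probabilities over a gap (in bp) for a two-state
    continuous-time Markov chain; state 0 = nucleosome, state 1 = NFR.
    Rates are hypothetical illustration parameters."""
    total = rate_to_nfr + rate_to_nuc
    pi_nuc = rate_to_nuc / total   # stationary probability of state 0
    pi_nfr = rate_to_nfr / total   # stationary probability of state 1
    decay = math.exp(-total * gap)  # dependence on the past fades with gap size
    return [
        [pi_nuc + pi_nfr * decay, pi_nfr * (1.0 - decay)],
        [pi_nuc * (1.0 - decay), pi_nfr + pi_nuc * decay],
    ]

def normal_pdf(x, mean, sd):
    """Gaussian density, used here as an assumed emission distribution."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def forward_filter(positions, signals, rate_to_nfr, rate_to_nuc, means, sds):
    """Filtered probabilities P(state at probe i | signals up to probe i).
    A signal of None marks a missing probe and contributes no emission term."""
    total = rate_to_nfr + rate_to_nuc
    belief = [rate_to_nuc / total, rate_to_nfr / total]  # start at stationarity
    filtered = []
    prev_pos = None
    for pos, y in zip(positions, signals):
        if prev_pos is not None:
            # Propagate the belief across the (possibly irregular) gap.
            P = transition_matrix(pos - prev_pos, rate_to_nfr, rate_to_nuc)
            belief = [belief[0] * P[0][s] + belief[1] * P[1][s] for s in (0, 1)]
        if y is not None:
            # Condition on the observed probe intensity and renormalize.
            belief = [belief[s] * normal_pdf(y, means[s], sds[s]) for s in (0, 1)]
            norm = belief[0] + belief[1]
            belief = [b / norm for b in belief]
        filtered.append(belief)
        prev_pos = pos
    return filtered

# Probes at irregular positions, with one missing measurement.
probs = forward_filter(
    positions=[0, 20, 95, 100],
    signals=[1.2, None, -0.8, -1.0],
    rate_to_nfr=0.01, rate_to_nuc=0.02,
    means=(1.0, -1.0), sds=(0.5, 0.5),
)
```

Because the transition matrix is a function of the gap, closely spaced probes are strongly coupled while probes separated by a long gap revert toward the stationary distribution. That is the property that makes a continuous-index formulation natural for tiling arrays with variable-sized gaps.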

Grants

  1. HG004946/NHGRI NIH HHS

MeSH Terms

Bayes Theorem
Binding Sites
DNA, Fungal
Markov Chains
Models, Genetic
Monte Carlo Method
Nucleosomes
Oligonucleotide Array Sequence Analysis
Yeasts

Chemicals

DNA, Fungal
Nucleosomes
