Predicting nucleosome positioning using a duration Hidden Markov Model.

Liqun Xi, Yvonne Fondufe-Mittendorf, Lei Xia, Jared Flatow, Jonathan Widom, Ji-Ping Wang
Author Information
  1. Liqun Xi: Department of Statistics, Northwestern University, Evanston, IL 60208, USA.

Abstract

BACKGROUND: The nucleosome is the fundamental packaging unit of DNA in eukaryotic cells. Its detailed positioning on the genome is closely related to chromosome functions. Increasing evidence shows that the genomic DNA sequence itself is highly predictive of nucleosome positioning genome-wide. A fast software tool for predicting nucleosome positioning can therefore help in understanding how a genome's nucleosome organization may facilitate genome function.
RESULTS: We present a duration Hidden Markov model for nucleosome positioning prediction that explicitly models the linker DNA length. The nucleosome and linker models, trained on yeast data, are re-scaled when making predictions for other species to adjust for differences in base composition. A software tool named NuPoP has been developed in three formats and is freely available for download.
CONCLUSIONS: Simulation studies show that modeling the linker length distribution and applying a base composition re-scaling method both improve the prediction of nucleosome positioning in terms of sensitivity and false discovery rate. NuPoP provides a user-friendly software tool for predicting nucleosome occupancy and the most probable nucleosome positioning map for genomic sequences of any size. When compared with two existing methods, NuPoP shows improved sensitivity.
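
To make the idea of a duration Hidden Markov model concrete, the following is a minimal sketch of Viterbi decoding for a two-state model in which nucleosomes have a fixed length and linkers draw their length from an explicit distribution. It is not NuPoP's implementation: the function name `viterbi_duration_hmm`, the position-independent per-base emission scores, the linker length table, and the rule that every linker (including the flanking ones) has length of at least 1 are all illustrative assumptions. The default `nuc_len=147` is the commonly cited core-particle length and is likewise assumed here.

```python
import math


def viterbi_duration_hmm(seq, nuc_logp, link_logp, link_len_logp, nuc_len=147):
    """Most probable nucleosome/linker segmentation of seq (toy model).

    nuc_logp, link_logp: dict base -> log emission probability (hypothetical).
    link_len_logp: dict linker length -> log duration probability (hypothetical).
    Returns (best_log_score, [(start, end), ...]) with half-open intervals.
    """
    n = len(seq)
    # Prefix sums make the emission score of any segment an O(1) lookup.
    nuc_pre, link_pre = [0.0], [0.0]
    for base in seq:
        nuc_pre.append(nuc_pre[-1] + nuc_logp[base])
        link_pre.append(link_pre[-1] + link_logp[base])

    neg_inf = -math.inf
    end_nuc = [neg_inf] * (n + 1)   # best score with a nucleosome ending at i
    end_link = [neg_inf] * (n + 1)  # best score with a linker ending at i
    back_link = [0] * (n + 1)       # linker length chosen for end_link[i]

    for i in range(1, n + 1):
        # A nucleosome ending at i must follow a linker or start the sequence.
        j = i - nuc_len
        if j >= 0:
            prev = 0.0 if j == 0 else end_link[j]
            end_nuc[i] = prev + nuc_pre[i] - nuc_pre[j]
        # A linker ending at i must follow a nucleosome or start the sequence.
        for d, len_logp in link_len_logp.items():
            j = i - d
            if j < 0:
                continue
            prev = 0.0 if j == 0 else end_nuc[j]
            score = prev + len_logp + link_pre[i] - link_pre[j]
            if score > end_link[i]:
                end_link[i] = score
                back_link[i] = d

    best = max(end_nuc[n], end_link[n])
    if best == neg_inf:
        return neg_inf, []  # no valid segmentation under the length limits

    # Trace back, alternating states until the sequence start is reached.
    nucleosomes = []
    i, in_nuc = n, end_nuc[n] >= end_link[n]
    while i > 0:
        if in_nuc:
            nucleosomes.append((i - nuc_len, i))
            i -= nuc_len
        else:
            i -= back_link[i]
        in_nuc = not in_nuc
    return best, nucleosomes[::-1]


# Toy usage: 4-bp "nucleosomes" that favor G/C, linkers that favor A/T.
toy_nuc = {b: math.log(p) for b, p in zip("ACGT", (0.1, 0.4, 0.4, 0.1))}
toy_link = {b: math.log(p) for b, p in zip("ACGT", (0.4, 0.1, 0.1, 0.4))}
toy_len = {1: math.log(0.2), 2: math.log(0.5), 3: math.log(0.3)}
toy_score, toy_calls = viterbi_duration_hmm(
    "ATGCGCTACGCGA", toy_nuc, toy_link, toy_len, nuc_len=4
)
print(toy_calls)
```

Because the linker length enters the score through its own probability table rather than through a self-transition, the decoder can penalize implausibly short or long linkers directly, which is the property the abstract attributes to the duration model.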

Grants

  1. HHSN266200400042C/PHS HHS
  2. U54CA143869/NCI NIH HHS

MeSH Term

DNA
Genome
Genomics
Markov Chains
Nucleosomes
Pattern Recognition, Automated
Software

Chemicals

Nucleosomes
DNA
