Deep learning architectures for prediction of nucleosome positioning from sequences data.

Mattia Di Gangi, Giosuè Lo Bosco, Riccardo Rizzo
Author Information
  1. Mattia Di Gangi: Fondazione Bruno Kessler, Via Sommarive, 18, Trento, 38123, Italy.
  2. Giosuè Lo Bosco: Dipartimento di Matematica e Informatica, Università degli studi di Palermo, Via Archirafi, 34, Palermo, 90123, Italy. giosue.lobosco@unipa.it.
  3. Riccardo Rizzo: CNR-ICAR, National Research Council of Italy, Via Ugo La Malfa, 153, Palermo, 90146, Italy.

Abstract

BACKGROUND: Nucleosomes are DNA-histone complexes, each wrapping about 150 base pairs of double-stranded DNA. They are fundamental to one of the primary functions of chromatin, i.e., packing the DNA into the nucleus of eukaryotic cells. Several biological studies have shown that nucleosome positioning influences the regulation of cell type-specific gene activities. Moreover, computational studies have shown evidence of sequence specificity in the DNA fragments wrapped into nucleosomes, clearly underlined by the organization of particular DNA substrings. As a consequence, nucleosomes have been successfully identified on a genomic scale by computational methods that use a sequence-feature representation.
RESULTS: In this work, we propose a deep learning model for nucleosome identification. Our model stacks convolutional layers and Long Short-Term Memory (LSTM) layers to automatically extract features from short- and long-range dependencies in a sequence. With this model we avoid the feature extraction and selection steps while improving classification performance.
CONCLUSIONS: Results computed on eleven data sets from five different organisms, ranging from yeast to human, show that the proposed method outperforms the state-of-the-art methods recently presented in the literature.
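
The stacking described in RESULTS can be illustrated with a minimal forward pass in plain Python. This is only a sketch under stated assumptions, not the authors' implementation: the one-hot input encoding, the max-pooling step, the filter count, the kernel width, the hidden size, and all function names are hypothetical choices made for illustration, and the weights are random (untrained), so the output score carries no biological meaning.

```python
import math
import random

ALPHABET = "ACGT"

def one_hot(seq):
    """Encode a DNA string as a list of 4-dimensional one-hot vectors (assumed encoding)."""
    return [[1.0 if base == letter else 0.0 for letter in ALPHABET] for base in seq.upper()]

def conv1d_relu(x, kernels, biases):
    """Valid 1D convolution along the sequence, followed by ReLU (short-range features)."""
    width = len(kernels[0])
    out = []
    for start in range(len(x) - width + 1):
        window = x[start:start + width]
        row = []
        for kernel, bias in zip(kernels, biases):
            total = bias + sum(k * v for k_row, x_row in zip(kernel, window)
                               for k, v in zip(k_row, x_row))
            row.append(max(0.0, total))
        out.append(row)
    return out

def max_pool(x, size):
    """Non-overlapping max pooling along the position axis (an assumed extra step)."""
    return [[max(col) for col in zip(*x[i:i + size])]
            for i in range(0, len(x) - size + 1, size)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_last_state(x, params, hidden):
    """Run a single-layer LSTM over x and return the final hidden state (long-range features)."""
    h = [0.0] * hidden
    c = [0.0] * hidden
    for x_t in x:
        inp = x_t + h  # concatenate current input with previous hidden state
        pre = {}
        for name in ("i", "f", "o", "g"):
            W, b = params[name]
            pre[name] = [b[j] + sum(w * v for w, v in zip(W[j], inp)) for j in range(hidden)]
        i = [sigmoid(z) for z in pre["i"]]    # input gate
        f = [sigmoid(z) for z in pre["f"]]    # forget gate
        o = [sigmoid(z) for z in pre["o"]]    # output gate
        g = [math.tanh(z) for z in pre["g"]]  # candidate cell update
        c = [f_j * c_j + i_j * g_j for f_j, c_j, i_j, g_j in zip(f, c, i, g)]
        h = [o_j * math.tanh(c_j) for o_j, c_j in zip(o, c)]
    return h

def random_matrix(rng, rows, cols, scale=0.1):
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]

def init_model(rng, filters=8, width=5, hidden=6):
    """Hypothetical hyperparameters; the abstract does not state the real ones."""
    return {
        "kernels": [random_matrix(rng, width, len(ALPHABET)) for _ in range(filters)],
        "conv_bias": [0.0] * filters,
        "lstm": {name: (random_matrix(rng, hidden, filters + hidden), [0.0] * hidden)
                 for name in ("i", "f", "o", "g")},
        "out_w": [rng.uniform(-0.1, 0.1) for _ in range(hidden)],
        "out_b": 0.0,
        "hidden": hidden,
    }

def predict(model, seq, pool=2):
    """Convolution -> pooling -> LSTM -> sigmoid score in (0, 1)."""
    x = one_hot(seq)
    x = conv1d_relu(x, model["kernels"], model["conv_bias"])
    x = max_pool(x, pool)
    h = lstm_last_state(x, model["lstm"], model["hidden"])
    return sigmoid(model["out_b"] + sum(w * v for w, v in zip(model["out_w"], h)))

rng = random.Random(0)
model = init_model(rng)
seq = "".join(rng.choice(ALPHABET) for _ in range(150))  # synthetic 150-bp fragment
score = predict(model, seq)
print(f"untrained nucleosome score: {score:.3f}")
```

The point of the sketch is the data flow: the raw sequence goes in directly, the convolution responds to short local patterns, and the recurrent layer summarizes how those patterns are arranged along the fragment, so no hand-crafted feature extraction or selection step appears anywhere. In practice such a model would be built and trained with a deep learning framework rather than written by hand.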

MeSH Term

Animals
Base Sequence
Databases, Nucleic Acid
Deep Learning
Humans
Neural Networks, Computer
Nucleosomes
ROC Curve
Reproducibility of Results
Saccharomyces cerevisiae

Chemicals

Nucleosomes
