Statistical inference of the rate of RNA polymerase II elongation by total RNA sequencing.

Yumi Kawamura, Shinsuke Koyama, Ryo Yoshida
Author Information
  1. Yumi Kawamura: Department of Statistical Science, The Graduate University for Advanced Studies (SOKENDAI), Tachikawa, Japan.
  2. Shinsuke Koyama: Department of Statistical Science, The Graduate University for Advanced Studies (SOKENDAI), Tachikawa, Japan.
  3. Ryo Yoshida: Department of Statistical Science, The Graduate University for Advanced Studies (SOKENDAI), Tachikawa, Japan.

Abstract

MOTIVATION: Sequencing total RNA without poly-A selection yields a transcriptomic profile of nascent RNAs that are still being transcribed and co-transcriptionally spliced. In general, the RNA-seq reads exhibit a sawtooth pattern within a gene, characterized by a monotonically decreasing gradient across each intron in the 5'-3' direction and by substantially higher read levels in exonic regions. Such patterns result from the underlying process of transcription elongation by RNA polymerase II, which traverses the DNA strand in the 5'-3' direction as it carries out a complex series of mRNA synthesis and processing steps. Total RNA-seq data could therefore be used to infer the rate of transcription elongation by solving the inverse problem.
RESULTS: Although solving the inverse problem in total RNA-seq has great potential, statistical methods for doing so have not yet been fully developed. We demonstrate to what extent the newly developed method can be useful. The objective is to reconstruct the spatial distribution of transcription elongation rates in a gene from a given noisy, sawtooth-like profile. The signal arising from the elongation rates must be recovered separately from several types of nuisance factors, such as unobserved modes of co-transcriptional mRNA splicing, which strongly influence the sawtooth shape. The present method was tested using published total RNA-seq data derived from mouse embryonic stem cells. We investigated the spatial characteristics of the estimated elongation rates, focusing especially on their relation to promoter-proximal pausing of RNA polymerase II, nucleosome occupancy and histone modification patterns.
AVAILABILITY AND IMPLEMENTATION: A C implementation of PolSter and sample data are available at https://github.com/yoshida-lab/PolSter.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
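
ILLUSTRATIVE SKETCH: The link between the intronic gradient and the elongation rate described in the abstract can be illustrated with a toy calculation. The sketch below is not the PolSter model. It assumes, purely for illustration, a single intron, a constant elongation rate, a steady initiation rate, noise-free coverage, and splicing that occurs instantly once the polymerase reaches the 3' end of the intron. All function names and parameter values are hypothetical. Under these assumptions the expected coverage at position x is proportional to the time the polymerase needs to travel from x to the intron end, so the slope of the gradient equals minus the initiation rate divided by the elongation rate.

```python
# Toy illustration only; NOT the PolSter model.
# Assumptions (hypothetical): one intron, constant elongation rate,
# steady initiation, noise-free coverage, instantaneous splicing at the 3' end.


def intron_coverage(length_bp, rate_bp_per_min, initiation_per_min, step_bp=100):
    """Expected nascent-RNA coverage at sampled positions along one intron.

    RNA covering position x persists until the polymerase reaches the intron
    end, i.e. for (length_bp - x) / rate_bp_per_min minutes, which gives a
    linearly decreasing profile in the 5'-3' direction.
    """
    positions = list(range(0, length_bp, step_bp))
    coverage = [
        initiation_per_min * (length_bp - x) / rate_bp_per_min for x in positions
    ]
    return positions, coverage


def least_squares_slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def estimate_rate(xs, ys, initiation_per_min):
    """Invert slope = -initiation / rate to recover the elongation rate."""
    return -initiation_per_min / least_squares_slope(xs, ys)


# Hypothetical values: a 50 kb intron, 2 kb/min elongation, 1 initiation/min.
positions, coverage = intron_coverage(50_000, 2_000.0, 1.0)
estimated = estimate_rate(positions, coverage, 1.0)
print(f"estimated elongation rate: {estimated:.1f} bp/min")
```

In this toy setting the slope alone determines only the ratio of initiation rate to elongation rate, so the sketch supplies the initiation rate as a known input. Real profiles are noisy and shaped by the nuisance factors named in the abstract, which is why a dedicated statistical method is needed.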

MeSH Terms

Animals
Mice
RNA
RNA Polymerase II
RNA Splicing
Sequence Analysis, RNA
Transcription, Genetic

Chemicals

RNA
RNA Polymerase II
