Accurate detection of differential RNA processing.

Philipp Drewe, Oliver Stegle, Lisa Hartmann, André Kahles, Regina Bohnert, Andreas Wachter, Karsten Borgwardt, Gunnar Rätsch
Author Information
  1. Philipp Drewe: Computational Biology Center, Sloan-Kettering Institute, 1275 York Avenue, New York, NY 10065, USA. drewe@cbio.mskcc.org

Abstract

Deep transcriptome sequencing (RNA-Seq) has become a vital tool for studying the state of cells in the context of varying environments, genotypes and other factors. RNA-Seq profiling data enable identification of novel isoforms, quantification of known isoforms and detection of changes in transcriptional or RNA-processing activity. Existing approaches to detect differential isoform abundance between samples either require a complete isoform annotation or fall short in providing statistically robust and calibrated significance estimates. Here, we propose a suite of statistical tests to address these open needs: a parametric test that uses known isoform annotations to detect changes in relative isoform abundance and a non-parametric test that detects differential read coverage and can be applied when isoform annotations are not available. Both methods account for the discrete nature of read counts and the inherent biological variability. We demonstrate that these tests compare favorably to previous methods in terms of both accuracy and statistical calibration. We use these techniques to analyze RNA-Seq libraries from Arabidopsis thaliana and Drosophila melanogaster. The identified differential RNA processing events were consistent with RT-qPCR measurements and previous studies. The proposed toolkit is available from http://bioweb.me/rdiff and enables in-depth analyses of transcriptomes, with or without available isoform annotation.
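
To make the idea of an annotation-free test on read coverage more concrete, the sketch below shows a generic permutation test that compares normalized coverage profiles between two conditions. It is an illustration only and is not the rDiff implementation: the function names (`coverage_distance`, `permutation_p_value`), the L1 distance statistic, the per-replicate normalization and the toy counts are all assumptions chosen for brevity. Permuting replicate labels is one simple way to let replicate-to-replicate variation shape the null distribution; the tests described in the abstract may model count noise and biological variability differently.

```python
import random


def coverage_distance(group_a, group_b):
    """L1 distance between the mean normalized coverage profiles of two groups.

    Each group is a list of replicates; each replicate is a list of read
    counts over the same positions (or bins) of one gene region.
    """
    def mean_profile(group):
        profiles = []
        for counts in group:
            total = sum(counts)
            # Normalize to relative coverage so that a change in overall
            # expression level alone does not register as a difference.
            profiles.append([c / total if total else 0.0 for c in counts])
        return [sum(col) / len(profiles) for col in zip(*profiles)]

    mean_a, mean_b = mean_profile(group_a), mean_profile(group_b)
    return sum(abs(x - y) for x, y in zip(mean_a, mean_b))


def permutation_p_value(group_a, group_b, n_perm=2000, seed=0):
    """Estimate a p-value by shuffling replicate labels (hypothetical helper)."""
    rng = random.Random(seed)
    observed = coverage_distance(group_a, group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        permuted = coverage_distance(pooled[:n_a], pooled[n_a:])
        # Small tolerance guards against floating-point summation order.
        if permuted >= observed - 1e-12:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_perm + 1)


# Toy counts (invented for illustration): condition A covers mostly the
# first two bins, condition B mostly the last two.
condition_a = [[30, 28, 5, 4], [32, 27, 6, 5], [29, 30, 4, 6], [31, 29, 5, 5]]
condition_b = [[10, 11, 20, 22], [9, 12, 21, 20], [11, 10, 22, 21], [10, 10, 19, 23]]

p_different = permutation_p_value(condition_a, condition_b)
p_identical = permutation_p_value(condition_a, condition_a)
```

With only four replicates per condition there are 70 distinct label splits, so the smallest p-value this sketch can report is roughly 2/70. A label-permutation scheme therefore needs more replicates, or a different resampling strategy, to reach genome-wide significance thresholds.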

Grants

  1. U24 CA143840/NCI NIH HHS

MeSH Terms

Algorithms
Animals
Arabidopsis
Data Interpretation, Statistical
Drosophila melanogaster
Gene Expression Profiling
Molecular Sequence Annotation
RNA Isoforms
RNA Processing, Post-Transcriptional
Reverse Transcriptase Polymerase Chain Reaction

Chemicals

RNA Isoforms
