Introduction

Recently, ultra high-throughput sequencing of RNA (RNA-Seq) has been developed as an approach for analysis of gene expression. By obtaining tens or even hundreds of millions of reads of transcribed sequences, an RNA-Seq experiment can offer a comprehensive survey of the population of genes (transcripts) in any sample of interest. This paper introduces a statistical model for estimating isoform abundance from RNA-Seq data. The model is flexible enough to accommodate both single-end and paired-end RNA-Seq data, as well as sampling bias along the length of the transcript. Based on the derivation of minimal sufficient statistics for the model, a computationally feasible implementation of the maximum likelihood estimator is provided. Further, it is shown that paired-end RNA-Seq provides more accurate isoform abundance estimates than single-end sequencing at fixed sequencing depth. Simulation studies are also given.
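
As a rough illustration of the kind of estimation described above, and not the tool's own implementation, the following sketch fits isoform abundances by maximum likelihood under an assumed Poisson model. Reads are collapsed into categories, so only the per-category counts are needed. The count for category i is assumed to be Poisson with mean sum_j rates[i][j] * theta[j]. The function name estimate_abundance, the rates matrix, and the toy numbers are all hypothetical. The optimizer is a standard EM iteration chosen for brevity, which may differ from the procedure used in the paper.

```python
def estimate_abundance(counts, rates, n_iter=500):
    """EM sketch for an assumed model: counts[i] ~ Poisson(sum_j rates[i][j] * theta[j]).

    counts: reads observed in each read category (hypothetical input)
    rates:  rates[i][j] = assumed sampling rate of category i from isoform j
    Returns a list of estimated abundances theta, one per isoform.
    """
    n_cat = len(counts)
    n_iso = len(rates[0])
    # Total sampling rate of each isoform across all categories.
    col = [sum(rates[i][j] for i in range(n_cat)) for j in range(n_iso)]
    # Start from a strictly positive, uniform guess.
    theta = [sum(counts) / sum(col)] * n_iso
    for _ in range(n_iter):
        # Expected count per category under the current estimate.
        mu = [sum(rates[i][j] * theta[j] for j in range(n_iso)) for i in range(n_cat)]
        # Multiplicative EM update: reassign observed reads to isoforms
        # in proportion to their current expected contribution.
        theta = [
            theta[j]
            * sum(counts[i] * rates[i][j] / mu[i] for i in range(n_cat) if mu[i] > 0)
            / col[j]
            for j in range(n_iso)
        ]
    return theta


# Toy example (made-up numbers): two isoforms, three read categories.
# Category 0 is shared by both isoforms; categories 1 and 2 are isoform-specific.
rates = [[1.0, 1.0],
         [0.5, 0.0],
         [0.0, 0.5]]
counts = [40, 5, 15]
theta = estimate_abundance(counts, rates)
print(theta)
```

In this toy setup the counts equal the expected values for abundances of 10 and 30, so the estimate should approach those values. In a paired-end setting, the categories would presumably be finer (for example, distinguished by inferred fragment length), which is one intuition for why paired-end data can separate isoforms better at the same depth.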

Publications

  1. Statistical Modeling of RNA-Seq Data.
    Salzman J, Jiang H, Wong WH, 2011-02-01 - Statistical science : a review journal of the Institute of Mathematical Statistics

Credits

  1. Julia Salzman
    Developer

    Research Associate in the Department of Statistics and Biochemistry, Stanford University, United States of America

  2. Hui Jiang
    Developer

  3. Wing Hung Wong
    Investigator

Summary

  Accession: BT000343
  Tool Type: Application
  Category:
  Platforms: Linux/Unix
  Technologies:
  User Interface: Terminal Command Line
  Download Count: 0
  Submitted By: Wing Hung Wong