|Full name:||The landscape of long noncoding RNAs in the human transcriptome|
|Description:||MiTranscriptome is a catalog of human long polyadenylated RNA transcripts derived from computational analysis of high-throughput RNA sequencing (RNA-seq) data from over 6,500 samples spanning diverse cancer and tissue types. Of the more than 91,000 genes in the complete catalog, the majority are previously uncharacterized lncRNAs. Gene expression analysis of the transcripts revealed numerous cancer-specific and lineage-specific RNAs.|
|University/Institution:||University of Michigan|
|Address:||Michigan Center for Translational Pathology, University of Michigan, Ann Arbor, Michigan, USA|
|Contact name (PI/Team):||Arul M Chinnaiyan|
|Contact email (PI/Helpdesk):||firstname.lastname@example.org|
The landscape of long noncoding RNAs in the human transcriptome. [PMID: 25599403]
Long noncoding RNAs (lncRNAs) are emerging as important regulators of tissue physiology and disease processes including cancer. To delineate genome-wide lncRNA expression, we curated 7,256 RNA sequencing (RNA-seq) libraries from tumors, normal tissues and cell lines comprising over 43 Tb of sequence from 25 independent studies. We applied ab initio assembly methodology to this data set, yielding a consensus human transcriptome of 91,013 expressed genes. Over 68% (58,648) of genes were classified as lncRNAs, of which 79% were previously unannotated. About 1% (597) of the lncRNAs harbored ultraconserved elements, and 7% (3,900) overlapped disease-associated SNPs. To prioritize lineage-specific, disease-associated lncRNA expression, we employed non-parametric differential expression testing and nominated 7,942 lineage- or cancer-associated lncRNA genes. The lncRNA landscape characterized here may shed light on normal biology and cancer pathogenesis and may be valuable for future biomarker development.