Deep annotation of long noncoding RNAs by assembling RNA-seq and small RNA-seq data.

Jiaming Zhang, Weibo Hou, Qi Zhao, Songling Xiao, Hongye Linghu, Lixin Zhang, Jiawei Du, Hongdi Cui, Xu Yang, Shukuan Ling, Jianzhong Su, Qingran Kong
Author Information
  1. Jiaming Zhang: Oujiang Laboratory, Zhejiang Provincial Key Laboratory of Medical Genetics, Key Laboratory of Laboratory Medicine, Ministry of Education, School of Laboratory Medicine and Life Sciences, Wenzhou Medical University, Wenzhou, Zhejiang Province, China; Oujiang Laboratory, Zhejiang Lab for Regenerative Medicine, Vision and Brain Health, Wenzhou Medical University, Wenzhou, Zhejiang Province, China.
  2. Weibo Hou: Oujiang Laboratory, Zhejiang Provincial Key Laboratory of Medical Genetics, Key Laboratory of Laboratory Medicine, Ministry of Education, School of Laboratory Medicine and Life Sciences, Wenzhou Medical University, Wenzhou, Zhejiang Province, China.
  3. Qi Zhao: Oujiang Laboratory, Zhejiang Lab for Regenerative Medicine, Vision and Brain Health, Wenzhou Medical University, Wenzhou, Zhejiang Province, China.
  4. Songling Xiao: Oujiang Laboratory, Zhejiang Provincial Key Laboratory of Medical Genetics, Key Laboratory of Laboratory Medicine, Ministry of Education, School of Laboratory Medicine and Life Sciences, Wenzhou Medical University, Wenzhou, Zhejiang Province, China.
  5. Hongye Linghu: Oujiang Laboratory, Zhejiang Provincial Key Laboratory of Medical Genetics, Key Laboratory of Laboratory Medicine, Ministry of Education, School of Laboratory Medicine and Life Sciences, Wenzhou Medical University, Wenzhou, Zhejiang Province, China.
  6. Lixin Zhang: Oujiang Laboratory, Zhejiang Provincial Key Laboratory of Medical Genetics, Key Laboratory of Laboratory Medicine, Ministry of Education, School of Laboratory Medicine and Life Sciences, Wenzhou Medical University, Wenzhou, Zhejiang Province, China.
  7. Jiawei Du: Oujiang Laboratory, Zhejiang Provincial Key Laboratory of Medical Genetics, Key Laboratory of Laboratory Medicine, Ministry of Education, School of Laboratory Medicine and Life Sciences, Wenzhou Medical University, Wenzhou, Zhejiang Province, China.
  8. Hongdi Cui: Oujiang Laboratory, Zhejiang Provincial Key Laboratory of Medical Genetics, Key Laboratory of Laboratory Medicine, Ministry of Education, School of Laboratory Medicine and Life Sciences, Wenzhou Medical University, Wenzhou, Zhejiang Province, China.
  9. Xu Yang: Oujiang Laboratory, Zhejiang Provincial Key Laboratory of Medical Genetics, Key Laboratory of Laboratory Medicine, Ministry of Education, School of Laboratory Medicine and Life Sciences, Wenzhou Medical University, Wenzhou, Zhejiang Province, China.
  10. Shukuan Ling: Oujiang Laboratory, Zhejiang Lab for Regenerative Medicine, Vision and Brain Health, Wenzhou Medical University, Wenzhou, Zhejiang Province, China. Electronic address: sh2ling@126.com.
  11. Jianzhong Su: Oujiang Laboratory, Zhejiang Lab for Regenerative Medicine, Vision and Brain Health, Wenzhou Medical University, Wenzhou, Zhejiang Province, China. Electronic address: sujz@wmu.edu.cn.
  12. Qingran Kong: Oujiang Laboratory, Zhejiang Provincial Key Laboratory of Medical Genetics, Key Laboratory of Laboratory Medicine, Ministry of Education, School of Laboratory Medicine and Life Sciences, Wenzhou Medical University, Wenzhou, Zhejiang Province, China. Electronic address: kqr721726@163.com.

Abstract

Long noncoding RNAs (lncRNAs) are increasingly recognized as modulators of various biological processes. However, because of their low expression, they are difficult to characterize systematically. Here, we performed transcript annotation in a wide variety of cellular contexts using a newly developed computational pipeline, termed the RNA-seq and small RNA-seq combined strategy (RSCS). The RSCS identified thousands of high-confidence potential novel transcripts, and the reliability of the transcriptome was verified by analysis of transcript structure, base composition, and sequence complexity. Length comparisons, the frequency of core promoter and polyadenylation signal motifs, and the locations of transcription start and end sites suggest that the transcripts are full length. Furthermore, taking advantage of our strategy, we identified a large number of endogenous retrovirus-associated lncRNAs, as well as a novel endogenous retrovirus-lncRNA that was functionally involved in the control of Yap1 expression and essential for early embryogenesis. In summary, the RSCS can generate a more complete and precise transcriptome, and our findings greatly expand the transcriptome annotation available to the mammalian research community.
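
The abstract names three sequence-level checks (base composition, sequence complexity, and polyadenylation signal motifs) without giving implementation details. The following is a minimal illustrative sketch of what such checks could look like, not the RSCS implementation. The function names, the 50-nt search window, the use of Shannon entropy as a complexity measure, and the motif list (the canonical AATAAA hexamer and the common ATTAAA variant) are all assumptions made for illustration.

```python
import math
from collections import Counter

# Assumed motif list: canonical polyadenylation signal and a common variant (DNA alphabet).
PAS_MOTIFS = ("AATAAA", "ATTAAA")


def gc_content(seq: str) -> float:
    """Fraction of G and C bases in the sequence (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def shannon_entropy(seq: str, k: int = 1) -> float:
    """Shannon entropy in bits of the k-mer distribution; lower values mean lower complexity."""
    seq = seq.upper()
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    if not kmers:
        return 0.0
    counts = Counter(kmers)
    total = len(kmers)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def has_pas_near_end(seq: str, window: int = 50) -> bool:
    """True if any assumed polyadenylation signal motif occurs in the last `window` bases."""
    tail = seq.upper()[-window:]
    return any(motif in tail for motif in PAS_MOTIFS)


if __name__ == "__main__":
    # Hypothetical transcript sequence, used only to exercise the functions.
    transcript = "GC" * 40 + "AATAAA" + "C" * 20
    print(gc_content(transcript))
    print(shannon_entropy(transcript, k=2))
    print(has_pas_near_end(transcript))
```

In practice such per-transcript values would be compared between newly assembled transcripts and a reference annotation; the thresholds and the reference set are choices this record does not specify.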

Keywords

MeSH Term

Animals
Embryonic Development
Mammals
Molecular Sequence Annotation
Promoter Regions, Genetic
Reproducibility of Results
Retroviridae
RNA, Long Noncoding
RNA-Seq
Transcription Initiation Site
Transcriptome
YAP-Signaling Proteins

Chemicals

RNA, Long Noncoding
YAP-Signaling Proteins
