Project accession PRJCA011077
Project title TAGET
Relevance Medical
Data type Raw sequence reads
Organism Homo sapiens
Description Single-molecule Real-time Isoform Sequencing (Iso-seq) of transcriptomes by PacBio can generate very long and accurate reads, thus providing an ideal platform for full-length transcriptome analysis. This technological breakthrough in sequencing requires novel computational tools that can fully utilize the benefits provided by Iso-seq. Here, we present an integrated computational toolkit named TAGET for Iso-seq full-length transcript data analyses, including transcript alignment, annotation, gene fusion detection, as well as quantification analyses such as differential gene expression analysis and differential isoform usage (DIU) analysis. We evaluated the performance of TAGET using a public Iso-seq dataset and four pairs of newly sequenced Iso-seq datasets from tumor and matched normal tissues. We found that TAGET achieved superior or similar performance in comparison with available methods. In particular, TAGET gave significantly more precise novel splice site predictions and thus enabled more accurate discovery of novel transcript isoforms and gene fusions. Experimental validation demonstrated the high precision of TAGET for identifying novel transcripts and gene fusions. In the paired laryngocarcinoma samples, we identified and experimentally validated a DIU gene, ECM1. ECM1 was shown to be an oncogene, but its isoform ECM1b might act as a tumor suppressor in laryngocarcinoma. Finally, we evaluated the performance of TAGET on Oxford Nanopore Technologies (ONT) datasets, demonstrating its broad applicability. Our results demonstrate that TAGET provides a valuable computational toolkit and can be applied to many full-length transcriptome studies.
Sample scope Monoisolate
Release date 2022-08-10
Publications
PubMed ID Article title Journal DOI Year
37741817 TAGET: a toolkit for analyzing full-length transcripts from long-read sequencing Nature Communications 10.1038/s41467-023-41649-0 2023
Funding
Agency Program Grant ID Grant title
National Natural Science Foundation of China (NSFC) 2015CB856000
Submitter Ruibin Xi (ruibinxi@math.pku.edu.cn)
Submitting organization Peking University
Submission date 2022-08-08

Data in this project

Resource Description
BioSample (10)