Pan-cancer TE Landscape

Data Overview

Sample Statistics

TE-SCALE is constructed from 24 publicly available single-cell datasets, encompassing 1,317,803 high-quality cells from 232 donors and 330 tumor or adjacent normal samples across 20 cancer types. These samples are derived from 12 tissue types: brain, breast, colon, kidney, liver, lung, ovary, pancreas, skin, small intestine, stomach, and thyroid gland.

[Figure: sample statistics]

Data Source

We curated raw scRNA-seq data and associated metadata for tumor and normal samples from five public repositories: CancerSCEM v1, GEO, GSA, GSA-Human, and ArrayExpress.

The database covers nearly all normal tissues corresponding to the included cancer types. In total, tumor samples comprise 73.9% of the data, while normal samples account for 26.1%.

[Figure: data source]

Cellular Characterization

Cell Type Annotation

We performed automated cell type annotation using CellTypist based on a custom-built cross-tissue reference model. The resulting labels span three hierarchical levels of increasing granularity.

  • Level 1 defines four major cell classes: epithelial cells, immune cells, stromal cells, and neuronal cells.
  • Level 2 refines these into 15 broad cell types.
  • Level 3 provides high-resolution annotations with 81 fine-grained cell types.

Level 1 and Level 3 annotations are integrated into the TE-SCALE database for user access and interactive exploration.
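The relationship between the fine-grained and major-class labels can be illustrated with a small roll-up sketch. This is not the CellTypist model or the TE-SCALE pipeline; the lookup table `LEVEL3_TO_LEVEL1`, its labels, and the helper names are hypothetical and only show how a Level 3 label might be collapsed to its Level 1 class.

```python
from collections import Counter

# Hypothetical Level 3 -> Level 1 lookup. The labels are illustrative only
# and are not taken from the TE-SCALE reference model.
LEVEL3_TO_LEVEL1 = {
    "CD8 T cell": "immune",
    "Macrophage": "immune",
    "Fibroblast": "stromal",
    "Endothelial cell": "stromal",
    "Enterocyte": "epithelial",
    "Excitatory neuron": "neuronal",
}


def roll_up(level3_labels, mapping=LEVEL3_TO_LEVEL1, unknown="unassigned"):
    """Map each fine-grained label to its major cell class."""
    return [mapping.get(label, unknown) for label in level3_labels]


def class_composition(level3_labels):
    """Return the fraction of cells falling into each major class."""
    counts = Counter(roll_up(level3_labels))
    total = sum(counts.values())
    return {cls: n / total for cls, n in counts.items()}
```

For example, `class_composition(["CD8 T cell", "Macrophage", "Fibroblast", "Enterocyte"])` gives half immune cells and a quarter each of stromal and epithelial cells. Labels missing from the lookup fall into an explicit `"unassigned"` bucket rather than being dropped silently.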

In addition, tumor cells were identified based on copy number variation (CNV) inference.

[Figure: cell annotation]

Cell Composition

[Figure: cell composition]

TE Characterization

Using scTEfinder, we identified a total of 1,051 TE subfamilies spanning five major classes: DNA transposons, LINEs, LTRs, SINEs, and Retroposons (SVAs).

Compared with normal epithelial cells, tumor cells showed both a significantly higher number of detected TEs and significantly higher TE expression levels, and this upregulation was consistently observed across all four TE classes.

[Figure: characterization of TE expression]

Tumor-specific TE Identification

Tumor-specific TEs were identified via strict differential expression analysis (adjusted p value < 0.05) between tumor cells and normal epithelial cells across various tissues. Many of these TEs exhibited strong cancer-type specificity; for example, 148 TEs were found in only one cancer type.
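To make the adjusted p value threshold concrete, the following is a minimal sketch of multiple-testing correction followed by filtering at 0.05. The text does not state which correction method was used, so the Benjamini-Hochberg procedure here is an assumption, and the function names and input layout are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values in the input order.

    Assumption: BH is used purely for illustration; the correction method
    behind the reported adjusted p values is not specified in the text.
    """
    m = len(p_values)
    # Indices of the p values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def significant_tes(p_by_te, alpha=0.05):
    """Keep the TEs whose adjusted p value is below alpha."""
    names = list(p_by_te)
    adjusted = benjamini_hochberg([p_by_te[name] for name in names])
    return {name: q for name, q in zip(names, adjusted) if q < alpha}
```

With made-up inputs such as `{"TE_A": 0.01, "TE_B": 0.04, "TE_C": 0.03, "TE_D": 0.20}`, only `TE_A` survives: `TE_B` and `TE_C` have raw p values below 0.05 but adjusted values slightly above it. A real analysis would typically also apply an effect-size cutoff, which is omitted here.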

[Figure: differential expression 1]

Among these tumor-specific TEs, LTR elements, especially those of the HERV family, contribute the majority of markers in many cancers, highlighting their potential for cancer diagnosis and immunotherapy.

[Figures: differential expression 2, differential expression 3]

Some tumor-specific TEs are also shared across cancers. Notably, nine TEs were detected in up to eight cancer types, almost all of which are members of the LINE1 (L1) family. Their activation in cancers suggests they may serve as pan-cancer biomarkers and therapeutic targets.

You can download all the tumor-specific TEs across cancer types here.
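Counts such as "found in only one cancer type" or "shared across several cancer types" can be derived by tallying, for each TE, the number of cancer types in which it is tumor-specific. The sketch below assumes a hypothetical input layout (a dict from cancer type to a list of TE names); the format of the downloadable file is not described in the text and may differ.

```python
from collections import Counter


def cancer_type_counts(tumor_specific):
    """Count the number of cancer types in which each TE is tumor-specific.

    tumor_specific: dict mapping a cancer type to an iterable of TE names
    (a hypothetical layout chosen for this example).
    """
    counts = Counter()
    for tes in tumor_specific.values():
        # Use a set so that a TE listed twice for one cancer counts once.
        counts.update(set(tes))
    return counts


def split_by_sharing(tumor_specific, min_shared=2):
    """Return (cancer-type-specific TEs, TEs shared by >= min_shared types)."""
    counts = cancer_type_counts(tumor_specific)
    unique = sorted(te for te, n in counts.items() if n == 1)
    shared = sorted(te for te, n in counts.items() if n >= min_shared)
    return unique, shared
```

Raising `min_shared` narrows the second list to the most broadly shared TEs, which is how a set of pan-cancer candidates could be isolated from per-cancer results.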

[Figure: differential expression 4]

Differential TE Expression across Tumor Stages

Over 40% of samples in our dataset include clinical tumor stage information. Differential expression analysis of tumor cells across stages within the same cancer type revealed distinct stage-specific TE expression patterns. These findings highlight the potential of TEs as stage-associated biomarkers and provide clues to their role in tumor progression.

[Figures: differential expression 2, differential expression 3]

Studies Using TEs as Immunotherapy Targets

TEs are emerging as a key source of tumor-specific antigens (TSAs), especially in cancers with low mutation burden. TE-derived peptides have shown immunogenicity, and some TEs encode membrane-bound proteins that are targetable by antibody- or cell-based therapies. These features position TEs as promising candidates for biomarker discovery and cancer immunotherapy development.

If you have any questions or comments, please feel free to contact us at te-scale@cncb.ac.cn.