Help

1. Database Overview

Transposable elements (TEs) constitute nearly half of the human genome. Normally silenced in somatic tissues, TEs can become aberrantly activated in cancer, contributing to tumor progression and oncogenesis. Altered TE expression is increasingly recognized as a hallmark of cancer and a promising target for immunotherapy. However, due to their repetitive nature, TEs are often excluded from gene-centric single-cell analyses, limiting our understanding of TE dynamics in cancer. To address this gap, we developed TE-SCALE, a dedicated database for systematic analysis and visualization of TE expression patterns across human cancers at single-cell resolution.

To our knowledge, TE-SCALE represents the first pan-cancer single-cell atlas of TE expression. It is constructed from publicly available scRNA-seq datasets comprising over 1.3 million high-quality cells from 330 samples across 20 cancer types and 12 tissue origins. Powered by a streamlined and standardized computational pipeline, TE-SCALE enables robust quantification of 1,051 curated TE subfamilies, integrates both gene and TE expression profiles, and supports accurate cell-type annotation. The platform facilitates comprehensive analyses of TE activity, including TE expression dynamics across cell types, identification of tumor-specific TEs, and exploration of TE-gene co-expression networks and their functional roles within the tumor microenvironment.

TE-SCALE serves as a critical resource for discovering TE-derived candidate biomarkers and immunotherapy targets, advancing cancer research through enhanced insights into TE dysregulation and enabling TE-based diagnostic and therapeutic innovation.

2. Home Page

The Home page comprises several key modules: database functional navigation, quick search by cancer type, tissue origin, or TE subfamily, core statistics, highlights, key analyses, related resources, updates, citation, and contact information. To make the database accessible to users from single-cell biology, cancer biology, or clinical research who may be less familiar with TEs, the Home page also provides a concise introduction to TEs.

home

3. Tissue Map Page

The Tissue Map page presents a UMAP visualization of 12 integrated tissue origins. It provides an interactive analytical framework that supports multi-level data exploration—from pan-cancer comparisons to tissue-, sample-, and cell-level resolution—allowing researchers to investigate TE expression in diverse biological contexts.

tissue map

Upon selecting a tissue of interest, users enter the Tissue Explorer, which includes the following components:

(1) UMAP Explorer
Interactive UMAP plots display both feature expression (e.g., specific TEs or genes) and metadata attributes (e.g., tissue type, cancer type, disease state), enabling intuitive visual comparison across different conditions.

i. Expression: select a TE, e.g. MER127

ii. Expression: select a gene

iii. Metadata, e.g. cell type

tissue map UMAP explorer

(2) Tumor-specific TEs
For each tissue, TE-SCALE highlights tumor-specific TE subfamilies that are differentially expressed between tumor and normal cells within each cancer type. These TEs provide candidate biomarkers with potential translational value.

tumor-specific TE
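The idea of comparing TE expression between tumor and normal cells can be illustrated with a minimal Python sketch. It is not the TE-SCALE implementation: the function names, the pseudocount, and the expression values are all hypothetical, and a real analysis would also apply a statistical test with multiple-testing correction.

```python
import math


def log2_fold_change(tumor_values, normal_values, pseudocount=1.0):
    """Log2 ratio of mean expression in tumor vs. normal cells.

    The pseudocount (an assumption of this sketch) avoids division by zero
    when a TE subfamily is not detected in one of the groups.
    """
    tumor_mean = sum(tumor_values) / len(tumor_values)
    normal_mean = sum(normal_values) / len(normal_values)
    return math.log2((tumor_mean + pseudocount) / (normal_mean + pseudocount))


def fraction_expressing(values):
    """Fraction of cells in which the TE subfamily is detected (value > 0)."""
    return sum(1 for v in values if v > 0) / len(values)


# Hypothetical per-cell expression values for a single TE subfamily.
tumor_cells = [8.0, 6.0, 7.0, 0.0, 9.0, 12.0]
normal_cells = [0.0, 2.0, 0.0, 1.0, 3.0, 0.0]

lfc = log2_fold_change(tumor_cells, normal_cells)
print(f"log2 fold change (tumor vs. normal): {lfc:.2f}")
print(f"fraction of tumor cells expressing:  {fraction_expressing(tumor_cells):.2f}")
print(f"fraction of normal cells expressing: {fraction_expressing(normal_cells):.2f}")
```

A positive log2 fold change together with a higher fraction of expressing tumor cells is the kind of pattern that would flag a subfamily as a tumor-specific candidate.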

(3) Differential TE Expression across Tumor Stages
TE-SCALE also performs differential TE expression analysis across tumor stages within a single cancer subtype for datasets with clinical stage information.

differential TE expression across tumor stages

(4) TE-gene Co-expression in Tumor Cells
TE-SCALE constructs TE-gene co-expression networks using a WGCNA approach and identifies functional gene modules specific to tumor cells. This enables users to explore regulatory relationships and tumor-specific expression programs. The potential biological roles of TEs are inferred from their co-expressed genes within the same module.

TE-gene co-expression
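The first step of a WGCNA-style analysis, turning pairwise correlations into a soft-thresholded adjacency, can be sketched in Python as follows. This is an illustration under stated assumptions rather than the pipeline used by TE-SCALE: the feature names and expression profiles are hypothetical, the network is unsigned, and the power beta=6 is only a commonly used starting value. Module detection (topological overlap and hierarchical clustering) is not shown.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (non-zero variance assumed)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def soft_threshold_adjacency(profiles, beta=6):
    """Unsigned adjacency a_ij = |cor(i, j)| ** beta for every feature pair."""
    names = list(profiles)
    adjacency = {}
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            r = pearson(profiles[first], profiles[second])
            adjacency[(first, second)] = abs(r) ** beta
    return adjacency


# Hypothetical expression profiles across four tumor cell groups.
profiles = {
    "TE_A": [1.0, 2.0, 3.0, 4.0],
    "GENE_X": [2.0, 4.0, 6.0, 8.0],  # rises and falls with TE_A
    "GENE_Y": [4.0, 1.0, 3.0, 2.0],  # weakly anti-correlated with TE_A
}

adjacency = soft_threshold_adjacency(profiles)
for pair, weight in adjacency.items():
    print(pair, round(weight, 4))
```

Raising the absolute correlation to a power keeps strongly correlated TE-gene pairs close to 1 while pushing weak correlations toward 0, which is why co-expressed genes in the same module are informative about a TE's potential role.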

4. Search Page

TE-SCALE provides three search channels: quick search on the Home page, advanced search for samples or TEs on the Search page, and global fuzzy search embedded in the Browse page.

4.1. Quick Search

Quick search provides users with real-time querying by cancer type, tissue origin, or TE subfamily. The search returns corresponding samples along with their analysis results and the overall expression patterns of the selected TE.

4.2. Advanced Search by Samples

Five options are available for finding samples of interest: Sample name, Tissue origin, Cancer type, Project, and Platform. These options can be combined in composite search mode. Each result in the table includes a link to the corresponding detailed sample page.

search by samples
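Composite searching can be pictured as applying several field filters at once. The Python sketch below assumes that the selected options are combined with a logical AND and matched case-insensitively; this behaviour, the field names, and the sample records are all hypothetical and are not taken from the TE-SCALE backend.

```python
def search_samples(samples, **criteria):
    """Return the samples that match every supplied field (logical AND)."""
    def matches(sample):
        return all(
            str(sample.get(field, "")).lower() == str(value).lower()
            for field, value in criteria.items()
        )
    return [sample for sample in samples if matches(sample)]


# Hypothetical sample records using the five search fields.
samples = [
    {"sample_name": "S1", "tissue_origin": "Lung", "cancer_type": "LUAD",
     "project": "P1", "platform": "10x 3' v3"},
    {"sample_name": "S2", "tissue_origin": "Lung", "cancer_type": "LUSC",
     "project": "P1", "platform": "10x 3' v2"},
    {"sample_name": "S3", "tissue_origin": "Liver", "cancer_type": "HCC",
     "project": "P2", "platform": "10x 5' v2"},
]

hits = search_samples(samples, tissue_origin="lung", cancer_type="LUAD")
print([sample["sample_name"] for sample in hits])
```

Leaving an option empty simply omits that criterion, so a search on tissue origin alone would return both lung samples in this toy data.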

4.3. Advanced Search by TEs

Four options are available for finding TEs of interest. TE subfamily, Class, and Family can be combined in composite search mode, whereas the TE-associated genes option can only be used on its own. Each result in the table includes a link to the corresponding TE information page.

search by TEs

5. Browse Page

5.1. Browse Samples

The Sample Browse page presents an interactive table of key information for all collected scRNA-seq datasets. The filtered cell count is reported as the basis for downstream analysis. Project accession links direct to the original data source, sample links lead to a detail page containing a curated metadata table for the sample and its TE transcriptomic analysis results, and reference links point to the related publication(s).

browse samples

(1) Presentation of Sample Details

sample details

(2) TE Transcriptome Analysis Results
There are six major modules for browsing transcriptome analytical results: cell type composition, TE expression distribution, UMAP visualization of cell clustering, CNV inference, differential expression, and GSVA enrichment analysis.

TE transcriptome analysis results 1 TE transcriptome analysis results 2

5.2. Browse TEs

The TE Browse page presents an interactive table of key information for all TE subfamilies identified by scTEfinder in our datasets. The Dfam accession links to the corresponding Dfam entry, and the TE name links to a detail page containing annotation data, summary statistics, and overall expression profiles across diverse biological contexts. Analyses of TE-correlated genes are also available and may help elucidate TE-gene regulatory relationships and functional roles.

browse TEs

(1) Presentation of TE Basic Information

TE details

(2) TE Expression and Genomic Features
There are six major modules for browsing TE results: genomic locus distribution, age and length distribution, genomic region distribution, TE expression across tissues, TE expression across cell types, and TE-gene correlation analysis across different tissues and cell types.

TE expression and genomic features 1 TE expression and genomic features 2

(3) TE-Gene Co-expression Modules in Cancer
This module summarizes TE co-expression modules across various cancer types, providing a comprehensive view of TE regulatory landscapes.

TE-gene co-expression modules in cancer

6. Pan Cancer Page

The Pan Cancer page presents comprehensive analyses of TE expression at single-cell resolution. It includes detailed statistics on sample distribution, scRNA-seq data sources, cell-type annotations, cell compositions, and TE expression characteristics. A key feature is the systematic identification of tumor-specific TE subfamilies across diverse cancer types, encompassing both previously reported and novel elements. Many of these TEs display strong specificity for particular cancer types or disease states and perform comparably to established cancer marker genes in distinguishing tumor from normal cells, thereby offering valuable candidates for diagnostic biomarker discovery and clinical immunotherapy targeting. In addition, TE-SCALE incorporates a curated collection of TE-related clinical studies, covering applications such as vaccines, antibody therapies, cell-based therapies, and neoantigen identification.

pan cancer

7. Tools Page

TE-SCALE features scTEfinder, a robust and standardized pipeline developed for accurate quantification of TE subfamilies from raw scRNA-seq data. This pipeline enables comprehensive downstream analyses, including cell-type annotation, clustering, TE differential expression, TE-gene co-expression analysis, and functional enrichment.

The platform provides precomputed analysis results for all datasets using scTEfinder, while also supporting offline use for custom datasets. It is compatible with scRNA-seq data generated from all widely adopted 10x Genomics library chemistries, including 3' v1/v2/v3 and 5' v1/v2/v3 protocols. Researchers can apply scTEfinder to their own data to ensure reproducible and high-quality TE quantification and downstream functional analyses.

tools

8. Download Page

The Download page provides access to Seurat objects generated by the standardized scTEfinder pipeline. Each object contains both gene and TE expression count matrices, along with comprehensive metadata for downstream analysis.

download
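Because each object carries both gene and TE counts, a simple first check after download is the share of each cell's counts that comes from TEs. Seurat objects are R data structures, so the Python sketch below assumes the count matrix has already been exported to a plain per-cell dictionary; the export step, the feature names, and the counts are hypothetical.

```python
def te_fraction_per_cell(counts, te_features):
    """Fraction of each cell's total counts assigned to TE features."""
    fractions = {}
    for cell, features in counts.items():
        total = sum(features.values())
        te_total = sum(n for name, n in features.items() if name in te_features)
        # Cells with no counts get 0.0 rather than raising ZeroDivisionError.
        fractions[cell] = te_total / total if total else 0.0
    return fractions


# Hypothetical counts: cell barcode -> {feature name: count}.
counts = {
    "cell_1": {"GENE_X": 90, "TE_A": 10},
    "cell_2": {"GENE_X": 30, "TE_A": 5, "TE_B": 15},
}
te_features = {"TE_A", "TE_B"}  # names of the TE rows in the matrix

fractions = te_fraction_per_cell(counts, te_features)
for cell, fraction in fractions.items():
    print(cell, f"{fraction:.2f}")
```

The same split between gene rows and TE rows applies when working with the object directly in R.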

9. Case Study

HERVK13-int is a subfamily of HERV-K. The HML-2 group within HERV-K has been experimentally shown to be transcriptionally upregulated in lung cancer cell lines and patient blood samples. On the Lung subpage of Tissue Map, researchers can systematically explore the tumor-specific activation and molecular context of HERVK13-int in lung adenocarcinoma (LUAD).

(1) Identify Tumor-specific Activation
The Tumor-Specific TEs table highlights HERVK13-int as significantly upregulated in tumor cells of LUAD compared with normal epithelial cells.

identify tumor-specific activation

(2) Visualize Single-cell Expression
Using the UMAP Explorer, researchers can visualize the integrated single-cell atlas of lung cancers, showing cell clustering by cell type alongside HERVK13-int expression. The TE-level UMAP reveals strong HERVK13-int expression in primary tumor cells, with elevated expression also detected in a subset of macrophages.

visualize single-cell expression

(3) Examine Expression across Tumor Stages
The Differential TE Expression across Tumor Stages table and accompanying heatmap demonstrate sustained high expression of HERVK13-int in stage I tumor cells, suggesting activation early in tumor progression.

examine expression across tumor stages

(4) Explore TE-gene Co-expression Modules
In the TE-Gene Co-expression in Tumor Cells section, WGCNA identifies 14 gene-TE co-expression modules. HERVK13-int is included in Tumor-M3, which contains ~18.9% cancer-related genes.

explore TE-gene co-expression modules 1 explore TE-gene co-expression modules 2 explore TE-gene co-expression modules 3

Researchers can further investigate genes correlated with a specific TE using the TE-Gene Correlation feature on the TE detail page.

explore TE-gene co-expression modules 4

(5) Assess Functional Enrichment
Functional enrichment analysis of Tumor-M3 reveals significant involvement of TNFα-NFκB signaling, the p53 pathway, and apoptosis, consistent with early immune-related activation in tumor cells.

assess functional enrichment

Tumor-specific activation of HERVK13-int in early-stage LUAD highlights it as a potential early diagnostic biomarker and suggests a role in tumor-immune interactions.

If you have any questions or comments, please feel free to contact us at te-scale@cncb.ac.cn.