Database Commons

a catalog of worldwide biological databases

Database Profile

PCTA

General information

URL: https://pcatools.shinyapps.io/PCTA_app
Full name: Pan-cancer Cell Line Transcriptome Atlas
Description: PCTA is a curated dataset comprising RNAseq data of 84,385 samples from 535 cell lines, representing 114 cancer types across 30 tissue origins. The dataset allows non-bioinformaticians to explore gene expression patterns through an interactive web application, enhancing the accessibility and utility of RNAseq data for cancer research.
Year founded: 2024
Last update: 2024-04-28
Version: v1.0
Accessibility: Accessible
Country/Region: United States

Classification & Tag

Data type: RNA
Data object:
Database category:
Major species:
Keywords:

Contact information

University/Institution: LSU Health Shreveport
Address:
City:
Province/State:
Country/Region: United States
Contact name (PI/Team): Siyuan Cheng
Contact email (PI/Helpdesk): siyuan.cheng@lsuhs.edu

Publications

PCTA, a pan-cancer cell line transcriptome atlas. [PMID: 38462036]
Siyuan Cheng, Lin Li, Xiuping Yu

A substantial volume of RNA sequencing data have been generated from cancer cell lines. However, it requires specific bioinformatics skills to compare gene expression levels across cell lines. This has hindered non-bioinformaticians from fully utilizing these valuable datasets in their research. To bridge this gap, we established a curated Pan-cancer Cell Line Transcriptome Atlas (PCTA) dataset. This resource aims to provide a user-friendly platform, allowing researchers without extensive bioinformatics expertise to access and leverage the wealth of information within the dataset for their studies. The PCTA dataset encompasses the expression matrix of 24,965 genes, featuring data from 84,385 samples derived from 5,677 studies. This comprehensive compilation spans 535 cell lines, representing a spectrum of 114 cancer types originating from 30 diverse tissue types. On UMAP plots, cell lines originating from the same type of tissue tend to cluster together, illustrating the dataset's ability to capture biological relationships. Additionally, an interactive and user-friendly web application (https://pcatools.shinyapps.io/PCTA_app/) was developed for researchers to explore the PCTA dataset. This platform allows users to examine the expression of their genes of interest across a diverse array of samples.

Cancer Lett. 2024:588 | 8 Citations (from Europe PMC, 2026-03-28)
PCTA, A PAN-CANCER CELL LINE TRANSCRIPTOME ATLAS. [PMID: 38260452]
Siyuan Cheng, Lin Li, Xiuping Yu

BACKGROUND: A substantial volume of RNA sequencing data were generated from cancer cell lines. However, it requires specific bioinformatics skills to compare gene expression levels across cell lines. This has hindered non-bioinformaticians from fully utilizing these valuable datasets in their research. To bridge this gap, we established a curated Pan-cancer Cell Line Transcriptome Atlas (PCTA) dataset. This resource aims to provide a user-friendly platform, allowing researchers without extensive bioinformatics expertise to access and leverage the wealth of information within the dataset for their studies. Importantly, PCTA stands out by offering sufficient sample numbers per cell line in comparison to other pan-cancer datasets.
METHODS: Cell line metadata and RNA sequencing data were retrieved from the Cancer Cell Line Encyclopedia (CCLE), SRA and ARCHS4 databases. Utilizing the programming language R, we conducted data retrieval, normalization, and visualization. Only expression data for protein-coding genes and long non-coding RNAs (lncRNAs) were considered in this study, streamlining the focus to enhance the precision and relevance of the analysis.
RESULTS: The resulting PCTA dataset encompasses the expression matrix of 24,965 genes, featuring data from 84,385 samples derived from 5,677 studies. This comprehensive compilation spans 535 cell lines, representing a spectrum of 114 cancer types originating from 30 diverse tissue types. On UMAP plots, cell lines originating from the same type of tissue tend to cluster together, illustrating the dataset's ability to capture biological relationships. To unravel molecular signatures, marker genes were identified for each cancer type. Additionally, an interactive and user-friendly web application (https://pcatools.shinyapps.io/PCTA_app/) was developed for researchers to explore the PCTA dataset. This platform allows users to examine the expression pattern of their genes of interest across a diverse array of samples. Data are visualized as violin, box, and point plots, enhancing the interpretability of the findings.
CONCLUSION: The PCTA stands as a comprehensive resource, offering insights into gene expression patterns across diverse cancer cell lines and providing a valuable tool to explore molecular signatures and potential therapeutic targets in cancer research.

bioRxiv. 2024 | 0 Citations (from Europe PMC, 2026-03-28)

Ranking

All databases: 2826/6932 (59.247%)
Gene genome and annotation: 876/2039 (57.087%)
Expression: 580/1361 (57.458%)
Health and medicine: 689/1755 (60.798%)
Total Rank: 2826
Citations: 7
z-index: 3.5
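
The percentages listed with each rank are consistent with a percentile of the form (N - rank + 1) / N, for example (6932 - 2826 + 1) / 6932 ≈ 59.247%. This record does not state the formula; it is inferred here from the four listed values, and the function name rank_percentile is hypothetical. A minimal Python sketch under that assumption:

```python
def rank_percentile(rank: int, total: int) -> float:
    """Share (in percent) of entries at this rank position or worse.

    Assumed formula, inferred from the listed values: (total - rank + 1) / total.
    """
    if not 1 <= rank <= total:
        raise ValueError("rank must be between 1 and total")
    return (total - rank + 1) / total * 100


# Rank/total pairs copied from the Ranking section of this profile.
rankings = {
    "All databases": (2826, 6932),
    "Gene genome and annotation": (876, 2039),
    "Expression": (580, 1361),
    "Health and medicine": (689, 1755),
}

for category, (rank, total) in rankings.items():
    # Print in the same "rank/total (percent%)" layout used by the profile.
    print(f"{category}: {rank}/{total} ({rank_percentile(rank, total):.3f}%)")
```

Under this reading, a higher percentage means a better position: rank 1 maps to 100%, and the last rank maps to 1/N.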

Community reviews

Not Rated
Data quality & quantity:
Content organization & presentation:
System accessibility & reliability:

Record metadata

Created on: 2024-07-16
Curated by:
Wenzhuo Cheng [2024-08-26]
Shiting Wang [2024-07-24]
Wenzhuo Cheng [2024-07-16]