Database Commons

a catalog of worldwide biological databases

Database Profile

CGHub

General information

URL: https://cghub.ucsc.edu/
Full name: Cancer Genomics Hub
Description: The Cancer Genomics Hub (CGHub) is a secure repository for storing, cataloging, and accessing cancer genome sequences, alignments, and mutation information from The Cancer Genome Atlas (TCGA) consortium and related projects.
Year founded: 2014
Last update: NA
Version: v1.0
Accessibility: Accessible
Country/Region: United States

Classification & Tag

Data type:
Data object:
Database category:
Major species:
Keywords:

Contact information

University/Institution: University of California Santa Cruz
Address: Santa Cruz, CA, USA
City: Santa Cruz
Province/State: CA
Country/Region: United States
Contact name (PI/Team): Christopher Wilks
Contact email (PI/Helpdesk): cwilks@soe.ucsc.edu

Publications

The Cancer Genomics Hub (CGHub): overcoming cancer through the power of torrential data. [PMID: 25267794]
Wilks C, Cline MS, Weiler E, Diekhans M, Craft B, Martin C, Murphy D, Pierce H, Black J, Nelson D, Litzinger B, Hatton T, Maltbie L, Ainsworth M, Allen P, Rosewood L, Mitchell E, Smith B, Warner J, Groboske J, Telc H, Wilson D, Sanford B, Schmidt H, Haussler D, Maltbie D.

The Cancer Genomics Hub (CGHub) is the online repository of the sequencing programs of the National Cancer Institute (NCI), including The Cancer Genome Atlas (TCGA), the Cancer Cell Line Encyclopedia (CCLE) and the Therapeutically Applicable Research to Generate Effective Treatments (TARGET) projects, with data from 25 different types of cancer. CGHub currently contains >1.4 PB of data, has grown at an average rate of 50 TB a month and serves >100 TB per week. The architecture of CGHub is designed to support bulk searching and downloading through a Web-accessible application programming interface, to enforce patient genome confidentiality in data storage and transmission, and to optimize for efficiency in access and transfer. In this article, we describe the design of these three components, present performance results for our transfer protocol, GeneTorrent, and finally report on the growth of the system in terms of data stored and transferred, including estimated limits on the current architecture. Our experience-based estimates suggest that centralizing storage and computational resources is more efficient than wide distribution across many satellite labs. Database URL: https://cghub.ucsc.edu. Published by Oxford University Press 2014. This work is written by US Government employees and is in the public domain in the US.

Database (Oxford). 2014:2014() | 107 Citations (from Europe PMC, 2025-12-20)

Ranking

All databases:
1430/6895 (79.275%)
Health and medicine:
347/1738 (80.092%)
Metadata:
138/719 (80.946%)
Standard ontology and nomenclature:
69/238 (71.429%)
Total Rank: 1430
Citations: 106
z-index: 9.636
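
The percentage shown next to each rank above appears to be the share of databases in that category ranked at or below CGHub's position. The sketch below reproduces the four listed values under that assumption. The formula (total - rank + 1) / total is inferred from the numbers on this page and is not documented here, and the function name rank_percentile is hypothetical.

```python
def rank_percentile(rank: int, total: int) -> float:
    """Share (in %) of entries ranked at or below `rank`, where rank 1 is best.

    Assumed formula, inferred from the values on this page:
    (total - rank + 1) / total * 100, rounded to three decimals.
    """
    if not 1 <= rank <= total:
        raise ValueError("rank must lie between 1 and total")
    return round((total - rank + 1) / total * 100, 3)


# Rank and category size exactly as listed in the Ranking section.
rankings = {
    "All databases": (1430, 6895),
    "Health and medicine": (347, 1738),
    "Metadata": (138, 719),
    "Standard ontology and nomenclature": (69, 238),
}

for category, (rank, total) in rankings.items():
    print(f"{category}: {rank}/{total} ({rank_percentile(rank, total):.3f}%)")
```

Under this reading, a higher percentage means a better standing within the category.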

Community reviews

Not Rated
Data quality & quantity:
Content organization & presentation:
System accessibility & reliability:

Record metadata

Created on: 2015-06-20
Curated by:
Mengwei Li [2016-04-12]
Mengwei Li [2015-12-02]
Mengwei Li [2015-06-27]