Database Commons

a catalog of worldwide biological databases

Database Profile

ATAV

General information

URL: http://atavdb.org
Full name:
Description: The ATAV data browser is a web user interface that allows everyone within the network to access variant-level data directly from the full data set in the ATAV database. It supports searching for variants by gene, region, and variant ID. The gene or region view displays a list of variants with fields such as allele count, allele frequency, number of samples, effect, and gene.
Year founded: 2021
Last update:
Version:
Accessibility: Accessible
Country/Region: United States

Classification & Tag

Data type: DNA
Data object:
Database category:
Major species:
Keywords:

Contact information

University/Institution: Institute for Genomic Medicine
Address:
City:
Province/State:
Country/Region: United States
Contact name (PI/Team): Nick Ren
Contact email (PI/Helpdesk): z.ren@columbia.edu

Publications

ATAV: a comprehensive platform for population-scale genomic analyses. [PMID: 33757430]
Zhong Ren, Gundula Povysil, Joseph A Hostyk, Hongzhu Cui, Nitin Bhardwaj, David B Goldstein

BACKGROUND: A common approach for sequencing studies is to do joint-calling and store variants of all samples in a single file. If new samples are continually added or controls are re-used for several studies, the cost and time required to perform joint-calling for each analysis can become prohibitive.
RESULTS: We present ATAV, an analysis platform for large-scale whole-exome and whole-genome sequencing projects. ATAV stores variant and per site coverage data for all samples in a centralized database, which is efficiently queried by ATAV to support diagnostic analyses for trios and singletons, as well as rare-variant collapsing analyses for finding disease associations in complex diseases. Runtime logs ensure full reproducibility and the modularized ATAV framework makes it extensible to continuous development. Besides helping with the identification of disease-causing variants for a range of diseases, ATAV has also enabled the discovery of disease-genes by rare-variant collapsing on datasets containing more than 20,000 samples. Analyses to date have been performed on data of more than 110,000 individuals demonstrating the scalability of the framework. To allow users to easily access variant-level data directly from the database, we provide a web-based interface, the ATAV data browser (http://atavdb.org/). Through this browser, summary-level data for more than 40,000 samples can be queried by the general public representing a mix of cases and controls of diverse ancestries. Users have access to phenotype categories of variant carriers, as well as predicted ancestry, gender, and quality metrics. In contrast to many other platforms, the data browser is able to show data of newly-added samples in real-time and therefore evolves rapidly as more and more samples are sequenced.
CONCLUSIONS: Through ATAV, users have public access to one of the largest variant databases for patients sequenced at a tertiary care center and can look up any genes or variants of interest. Additionally, since the entire code is freely available on GitHub, ATAV can easily be deployed by other groups that wish to build their own platform, database, and user interface.

BMC Bioinformatics. 2021:22(1) | 30 Citations (from Europe PMC, 2025-12-20)

Ranking

All databases: 2003/6895 (70.964%)
Gene genome and annotation: 627/2021 (69.025%)
Genotype phenotype and variation: 294/1005 (70.846%)
Total rank: 2003
Citations: 26
z-index: 6.5
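
The percentages shown next to each rank appear to be the share of databases ranked at or below ATAV's position. The following is a minimal sketch of that arithmetic, assuming the formula (total - rank + 1) / total; the formula is inferred from the three figures above rather than taken from Database Commons documentation, and the function name `rank_percentile` is hypothetical.

```python
def rank_percentile(rank: int, total: int) -> float:
    """Return the percentage of entries ranked at or below `rank`.

    Assumed formula (inferred from the figures on this page, not from
    official documentation): 100 * (total - rank + 1) / total.
    """
    if not 1 <= rank <= total:
        raise ValueError("rank must be between 1 and total")
    return round(100 * (total - rank + 1) / total, 3)


# Rank and category size as listed in the Ranking section.
rankings = {
    "All databases": (2003, 6895),
    "Gene genome and annotation": (627, 2021),
    "Genotype phenotype and variation": (294, 1005),
}

for category, (rank, total) in rankings.items():
    print(f"{category}: {rank}/{total} ({rank_percentile(rank, total)}%)")
```

Under this assumption, the three computed values match the percentages listed above (70.964%, 69.025%, and 70.846%).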

Community reviews

Not Rated
Data quality & quantity:
Content organization & presentation
System accessibility & reliability:

Record metadata

Created on: 2022-04-21
Curated by:
Lin Liu [2022-06-06]
Jing Wei [2022-05-16]
Yuxin Qin [2022-04-21]