Database Commons

a catalog of worldwide biological databases

Database Profile

ATGCs

General information

URL: http://dmk-brain.ecn.uiowa.edu/ATGC/
Full name: Alignable Tight Genomic Clusters
Description: ATGC is a resource for micro- and macro-evolutionary studies of Bacteria and Archaea, in particular pan-genomic studies of genes shared across multiple closely related organisms.
Year founded: 2009
Last update: 2017-01-01
Version:
Accessibility: Accessible
Country/Region: United States

Classification & Tag

Data type: DNA
Data object:
Database category:
Major species:
Keywords:

Contact information

University/Institution: National Center for Biotechnology Information
Address: National Center for Biotechnology Information, National Library of Medicine
City: Bethesda
Province/State: Maryland
Country/Region: United States
Contact name (PI/Team): David M. Kristensen
Contact email (PI/Helpdesk): david-kristensen@uiowa.edu

Publications

ATGC database and ATGC-COGs: an updated resource for micro- and macro-evolutionary studies of prokaryotic genomes and protein family annotation. [PMID: 28053163]
Kristensen DM, Wolf YI, Koonin EV.

The Alignable Tight Genomic Clusters (ATGCs) database is a collection of closely related bacterial and archaeal genomes that provides several tools to aid research into evolutionary processes in the microbial world. Each ATGC is a taxonomy-independent cluster of 2 or more completely sequenced genomes that meet the objective criteria of a high degree of local gene order (synteny) and a small number of synonymous substitutions in the protein-coding genes. As such, each ATGC is suited for analysis of microevolutionary variations within a cohesive group of organisms (e.g. species), whereas the entire collection of ATGCs is useful for macroevolutionary studies. The ATGC database includes many forms of pre-computed data, in particular ATGC-COGs (Clusters of Orthologous Genes), multiple sequence alignments, a set of 'index' orthologs representing the most well-conserved members of each ATGC-COG, the phylogenetic tree of the organisms within each ATGC, etc. Although the ATGC database contains several million proteins from thousands of genomes organized into hundreds of clusters (roughly a 4-fold increase since the last version of the ATGC database), it is now built with completely automated methods and will be regularly updated following new releases of the NCBI RefSeq database. The ATGC database is hosted jointly at the University of Iowa at dmk-brain.ecn.uiowa.edu/ATGC/ and the NCBI at ftp.ncbi.nlm.nih.gov/pub/kristensen/ATGC/atgc_home.html. Published by Oxford University Press on behalf of Nucleic Acids Research 2016. This work is written by (a) US Government employee(s) and is in the public domain in the US.

Nucleic Acids Res. 2017:45(D1) | 32 Citations (from Europe PMC, 2025-12-13)

ATGC: a database of orthologous genes from closely related prokaryotic genomes and a research platform for microevolution of prokaryotes. [PMID: 18845571]
Novichkov PS, Ratnere I, Wolf YI, Koonin EV, Dubchak I.

The database of Alignable Tight Genomic Clusters (ATGCs) consists of closely related genomes of archaea and bacteria, and is a resource for research into prokaryotic microevolution. Construction of a data set with appropriate characteristics is a major hurdle for this type of study. With the current rate of genome sequencing, it is difficult to follow the progress of the field and to determine which of the available genome sets meet the requirements of a given research project, in particular, with respect to the minimum and maximum levels of similarity between the included genomes. Additionally, extraction of specific content, such as genomic alignments or families of orthologs, from a selected set of genomes is a complicated and time-consuming process. The database addresses these problems by providing an intuitive and efficient web interface to browse precomputed ATGCs, select appropriate ones and access ATGC-derived data such as multiple alignments of orthologous proteins, matrices of pairwise intergenomic distances based on genome-wide analysis of synonymous and nonsynonymous substitution rates and others. The ATGC database will be regularly updated following new releases of the NCBI RefSeq. The database is hosted by the Genomics Division at Lawrence Berkeley National Laboratory and is publicly available at http://atgc.lbl.gov.

Nucleic Acids Res. 2009:37(Database issue) | 48 Citations (from Europe PMC, 2025-12-13)
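
The 2017 abstract above defines each ATGC as a cluster of 2 or more genomes with a high degree of synteny and a small number of synonymous substitutions. The sketch below illustrates one way such a rule could be applied. It is not the ATGC pipeline: the threshold values, the function name cluster_genomes, the input layout, and the use of single-linkage grouping are all assumptions made for illustration, since the abstract does not state them.

```python
# Hypothetical thresholds; the source gives no numeric values.
MIN_SYNTENY = 0.8  # assumed minimum fraction of genes in conserved local order
MAX_DS = 1.5       # assumed maximum synonymous substitutions per site


def cluster_genomes(genomes, pair_stats, min_synteny=MIN_SYNTENY, max_ds=MAX_DS):
    """Group genomes whose pairwise (synteny, dS) values pass both thresholds.

    pair_stats maps (genome_a, genome_b) -> (synteny_fraction, ds).
    Genomes are linked transitively (single linkage, an assumption),
    and only groups of 2 or more genomes are returned.
    """
    parent = {g: g for g in genomes}

    def find(g):
        # Follow parent links to the group representative, shortening the path.
        while parent[g] != g:
            parent[g] = parent[parent[g]]
            g = parent[g]
        return g

    for (a, b), (synteny, ds) in pair_stats.items():
        if synteny >= min_synteny and ds <= max_ds:
            parent[find(a)] = find(b)

    groups = {}
    for g in genomes:
        groups.setdefault(find(g), []).append(g)
    return sorted(sorted(members) for members in groups.values() if len(members) >= 2)


# Made-up example data: A, B and C are close; D is too divergent from C.
genomes = ["A", "B", "C", "D"]
pair_stats = {
    ("A", "B"): (0.95, 0.1),
    ("B", "C"): (0.90, 0.3),
    ("C", "D"): (0.40, 2.0),
}
clusters = cluster_genomes(genomes, pair_stats)
print(clusters)
```

Genome D is left out because a cluster needs at least 2 members, matching the "2 or more" condition in the abstract.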

Ranking

All databases: 2424/6895 (64.859%)
Raw bio-data: 173/582 (70.447%)
Phylogeny and homology: 112/302 (63.245%)
Standard ontology and nomenclature: 103/238 (57.143%)
Total Rank: 2424
Citations: 79
z-index: 4.938
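
The percentages listed with each rank are not the plain ratio rank/total. They are consistent with the share of databases ranked at or below this one, (total - rank + 1) / total. The sketch below checks that reading against the four listed values. The formula is inferred from the numbers on this page, not taken from Database Commons documentation, and the function name rank_percentile is hypothetical. The last lines note that the listed z-index of 4.938 matches 79 citations divided by the 16 years from 2009 to 2025; treating the z-index as citations per year is likewise an assumption.

```python
def rank_percentile(rank, total):
    """Percentage of entries ranked at or below `rank` (inferred formula)."""
    return round(100 * (total - rank + 1) / total, 3)


# (rank, total) pairs copied from the Ranking section of this record.
rankings = {
    "All databases": (2424, 6895),
    "Raw bio-data": (173, 582),
    "Phylogeny and homology": (112, 302),
    "Standard ontology and nomenclature": (103, 238),
}

percentiles = {name: rank_percentile(r, n) for name, (r, n) in rankings.items()}
for name, pct in percentiles.items():
    print(f"{name}: {pct}%")

# Assumption: z-index = total citations / years since founding.
citations = 79
years_active = 2025 - 2009  # founding year 2009, citation counts dated 2025
citations_per_year = citations / years_active
print(f"citations per year: {citations_per_year:.3f}")
```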

Community reviews

Not Rated
Data quality & quantity:
Content organization & presentation:
System accessibility & reliability:

Record metadata

Created on: 2015-07-26
Curated by:
Pei Liu [2022-08-23]
Dong Zou [2018-03-09]
Lina Ma [2017-06-21]
Shixiang Sun [2017-02-14]
Zhang Zhang [2016-04-26]
Mengwei Li [2016-04-12]
Mengwei Li [2016-03-31]
Mengwei Li [2016-03-28]
Mengwei Li [2015-11-27]