Database Commons

a catalog of worldwide biological databases

Database Profile

ECOD

General information

URL: http://prodata.swmed.edu/ecod
Full name: Evolutionary Classification of Protein Domains
Description: ECOD is a hierarchical classification of protein domains according to their evolutionary relationships. Only proteins with experimentally determined spatial structures from the PDB database are currently classified in ECOD.
Year founded: 2014
Last update: 2017-01-30
Version:
Accessibility:
Manual: Accessible
Country/Region: United States

Classification & Tag

Data type:
Data object:
Database category:
Major species:
Keywords:

Contact information

University/Institution: University of Texas Southwestern Medical Center
Address: Department of Biophysics
City: Dallas
Province/State: Texas
Country/Region: United States
Contact name (PI/Team): R. Dustin Schaeffer
Contact email (PI/Helpdesk): Richard.Schaeffer@UTSouthwestern.edu

Publications

A sequence family database built on ECOD structural domains. [PMID: 29659718]
Liao Y, Schaeffer RD, Pei J, Grishin NV.

Motivation: The ECOD database classifies protein domains based on their evolutionary relationships, considering both remote and close homology. The family group in ECOD provides classification of domains that are closely related to each other based on sequence similarity. Due to different perspectives on domain definition, direct application of existing sequence domain databases, such as Pfam, to ECOD struggles with several shortcomings.
Results: We created multiple sequence alignments and profiles from ECOD domains with the help of structural information in alignment building and boundary delineation. We validated the alignment quality by scoring structure superposition to demonstrate that they are comparable to curated seed alignments in Pfam. Comparison to Pfam and CDD reveals that 27 and 16% of ECOD families are new, but they are also dominated by small families, likely because of the sampling bias from the PDB database. There are 35 and 48% of families whose boundaries are modified comparing to counterparts in Pfam and CDD, respectively.
Availability and implementation: The new families are now integrated in the ECOD website. The aggregate HMMER profile library and alignment are available for download on ECOD website (http://prodata.swmed.edu/ecod).
Supplementary information: Supplementary data are available at Bioinformatics online.

Bioinformatics. 2018:34(17) | 2 Citations (from Europe PMC, 2024-12-14)

ECOD: new developments in the evolutionary classification of domains. [PMID: 27899594]
Schaeffer RD, Liao Y, Cheng H, Grishin NV.

Evolutionary Classification Of protein Domains (ECOD) (http://prodata.swmed.edu/ecod) comprehensively classifies protein with known spatial structures maintained by the Protein Data Bank (PDB) into evolutionary groups of protein domains. ECOD relies on a combination of automatic and manual weekly updates to achieve its high accuracy and coverage with a short update cycle. ECOD classifies the approximately 120 000 depositions of the PDB into more than 500 000 domains in ~3400 homologous groups. We show the performance of the weekly update pipeline since the release of ECOD, describe improvements to the ECOD website and available search options, and discuss novel structures and homologous groups that have been classified in the recent updates. Finally, we discuss the future directions of ECOD and further improvements planned for the hierarchy and update process. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

Nucleic Acids Res. 2017:45(D1) | 44 Citations (from Europe PMC, 2024-12-14)

Manual classification strategies in the ECOD database. [PMID: 25917548]
Cheng H, Liao Y, Schaeffer RD, Grishin NV.

ECOD (Evolutionary Classification Of protein Domains) is a comprehensive and up-to-date protein structure classification database. The majority of new structures released from the PDB (Protein Data Bank) each week already have close homologs in the ECOD hierarchy and thus can be reliably partitioned into domains and classified by software without manual intervention. However, those proteins that lack confidently detectable homologs require careful analysis by experts. Although many bioinformatics resources rely on expert curation to some degree, specific examples of how this curation occurs and in what cases it is necessary are not always described. Here, we illustrate the manual classification strategy in ECOD by example, focusing on two major issues in protein classification: domain partitioning and the relationship between homology and similarity scores. Most examples show recently released and manually classified PDB structures. We discuss multi-domain proteins, discordance between sequence and structural similarities, difficulties with assessing homology with scores, and integral membrane proteins homologous to soluble proteins. By timely assimilation of newly available structures into its hierarchy, ECOD strives to provide a most accurate and updated view of the protein structure world as a result of combined computational and expert-driven analysis. © 2015 Wiley Periodicals, Inc.

Proteins. 2015:83(7) | 44 Citations (from Europe PMC, 2024-12-14)

ECOD: an evolutionary classification of protein domains. [PMID: 25474468]
Cheng H, Schaeffer RD, Liao Y, Kinch LN, Pei J, Shi S, Kim BH, Grishin NV.

Understanding the evolution of a protein, including both close and distant relationships, often reveals insight into its structure and function. Fast and easy access to such up-to-date information facilitates research. We have developed a hierarchical evolutionary classification of all proteins with experimentally determined spatial structures, and presented it as an interactive and updatable online database. ECOD (Evolutionary Classification of protein Domains) is distinct from other structural classifications in that it groups domains primarily by evolutionary relationships (homology), rather than topology (or "fold"). This distinction highlights cases of homology between domains of differing topology to aid in understanding of protein structure evolution. ECOD uniquely emphasizes distantly related homologs that are difficult to detect, and thus catalogs the largest number of evolutionary links among structural domain classifications. Placing distant homologs together underscores the ancestral similarities of these proteins and draws attention to the most important regions of sequence and structure, as well as conserved functional sites. ECOD also recognizes closer sequence-based relationships between protein domains. Currently, approximately 100,000 protein structures are classified in ECOD into 9,000 sequence families clustered into close to 2,000 evolutionary groups. The classification is assisted by an automated pipeline that quickly and consistently classifies weekly releases of PDB structures and allows for continual updates. This synchronization with PDB uniquely distinguishes ECOD among all protein classifications. Finally, we present several case studies of homologous proteins not recorded in other classifications, illustrating the potential of how ECOD can be used to further biological and evolutionary studies.

PLoS Comput Biol. 2014:10(12) | 199 Citations (from Europe PMC, 2024-12-14)

Ranking

All databases: 426/6266 (93.217%)
Structure: 43/871 (95.178%)
Total Rank: 426
Citations: 275
z-index: 27.5
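
The percentages shown next to the ranks are not defined on this page. A minimal sketch, assuming they are the share of entries ranked at or below the given position (rank 1 being the best), reproduces both printed values. The function name `rank_percentile` is hypothetical and used only for illustration.

```python
def rank_percentile(rank: int, total: int) -> float:
    """Percentage of entries ranked at or below `rank`, where rank 1 is best.

    Assumed formula (not stated on the page): (total - rank + 1) / total.
    """
    if not 1 <= rank <= total:
        raise ValueError("rank must be between 1 and total")
    return 100.0 * (total - rank + 1) / total


# Rank figures taken from the Ranking section of this profile.
for label, rank, total in [("All databases", 426, 6266), ("Structure", 43, 871)]:
    print(f"{label}: {rank}/{total} ({rank_percentile(rank, total):.3f}%)")
```

Under this assumption the output matches the 93.217% and 95.178% listed above. The simpler form `1 - rank/total` gives about 93.201% for the first row, which does not match, which is why the `+ 1` variant is used here.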

Community reviews

Not Rated
Data quality & quantity:
Content organization & presentation:
System accessibility & reliability:

Record metadata

Created on: 2018-01-02
Curated by:
Lin Liu [2022-09-20]
Dong Zou [2019-01-02]
Shixiang Sun [2017-02-15]