Database Commons

a catalog of worldwide biological databases

Database Profile

ECOD

General information

URL: http://prodata.swmed.edu/ecod
Full name: Evolutionary Classification of Protein Domains
Description: ECOD is a hierarchical classification of protein domains according to their evolutionary relationships. Only proteins with experimentally determined spatial structures from the PDB database are currently classified in ECOD.
Year founded: 2014
Last update: 2017-01-30
Version:
Accessibility: Accessible
Country/Region: United States

Classification & Tag

Data type:
Data object:
Database category:
Major species:
Keywords:

Contact information

University/Institution: University of Texas Southwestern Medical Center
Address: Department of Biophysics
City: Dallas
Province/State: Texas
Country/Region: United States
Contact name (PI/Team): R. Dustin Schaeffer
Contact email (PI/Helpdesk): Richard.Schaeffer@UTSouthwestern.edu

Publications

ECOD: integrating classifications of protein domains from experimental and predicted structures. [PMID: 39565196]
R Dustin Schaeffer, Kirill E Medvedev, Antonina Andreeva, Sara Rocio Chuguransky, Beatriz Lazaro Pinto, Jing Zhang, Qian Cong, Alex Bateman, Nick V Grishin

The evolutionary classification of protein domains (ECOD) classifies protein domains using a combination of sequence and structural data (http://prodata.swmed.edu/ecod). Here we present the culmination of our previous efforts at classifying domains from predicted structures, principally from the AlphaFold Database (AFDB), by integrating these domains with our existing classification of PDB structures. This combined classification includes both domains from our previous, purely experimental, classification of domains as well as domains from our provisional classification of 48 proteomes in AFDB predicted from model organisms and organisms of concern to global health. ECOD classifies over 1.8 M domains from over 1 000 000 proteins collectively deposited in the PDB and AFDB. Additionally, we have changed the F-group classification reference used for ECOD, deprecating our original ECODf library and instead relying on direct collaboration with the Pfam sequence family database to inform our classification. Pfam provides similar coverage of ECOD with family classification while being more accurate and less redundant. By eliminating duplication of effort, we can improve both classifications. Finally, we discuss the initial deployment of DrugDomain, a database of domain-ligand interactions, on ECOD and discuss future plans.

Nucleic Acids Res. 2025:53(D1) | 16 Citations (from Europe PMC, 2025-12-13)

A sequence family database built on ECOD structural domains. [PMID: 29659718]
Liao Y, Schaeffer RD, Pei J, Grishin NV.

Motivation: The ECOD database classifies protein domains based on their evolutionary relationships, considering both remote and close homology. The family group in ECOD provides classification of domains that are closely related to each other based on sequence similarity. Due to different perspectives on domain definition, direct application of existing sequence domain databases, such as Pfam, to ECOD struggles with several shortcomings.
Results: We created multiple sequence alignments and profiles from ECOD domains with the help of structural information in alignment building and boundary delineation. We validated the alignment quality by scoring structure superposition to demonstrate that they are comparable to curated seed alignments in Pfam. Comparison to Pfam and CDD reveals that 27% and 16% of ECOD families, respectively, are new, but they are also dominated by small families, likely because of the sampling bias from the PDB database. The boundaries of 35% and 48% of families are modified compared to their counterparts in Pfam and CDD, respectively.
Availability and implementation: The new families are now integrated in the ECOD website. The aggregate HMMER profile library and alignment are available for download on ECOD website (http://prodata.swmed.edu/ecod).
Supplementary information: Supplementary data are available at Bioinformatics online.

Bioinformatics. 2018:34(17) | 5 Citations (from Europe PMC, 2025-12-13)

ECOD: new developments in the evolutionary classification of domains. [PMID: 27899594]
Schaeffer RD, Liao Y, Cheng H, Grishin NV.

Evolutionary Classification Of protein Domains (ECOD) (http://prodata.swmed.edu/ecod) comprehensively classifies proteins with known spatial structures maintained by the Protein Data Bank (PDB) into evolutionary groups of protein domains. ECOD relies on a combination of automatic and manual weekly updates to achieve its high accuracy and coverage with a short update cycle. ECOD classifies the approximately 120 000 depositions of the PDB into more than 500 000 domains in ~3400 homologous groups. We show the performance of the weekly update pipeline since the release of ECOD, describe improvements to the ECOD website and available search options, and discuss novel structures and homologous groups that have been classified in the recent updates. Finally, we discuss the future directions of ECOD and further improvements planned for the hierarchy and update process. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

Nucleic Acids Res. 2017:45(D1) | 71 Citations (from Europe PMC, 2025-12-13)

Manual classification strategies in the ECOD database. [PMID: 25917548]
Cheng H, Liao Y, Schaeffer RD, Grishin NV.

ECOD (Evolutionary Classification Of protein Domains) is a comprehensive and up-to-date protein structure classification database. The majority of new structures released from the PDB (Protein Data Bank) each week already have close homologs in the ECOD hierarchy and thus can be reliably partitioned into domains and classified by software without manual intervention. However, those proteins that lack confidently detectable homologs require careful analysis by experts. Although many bioinformatics resources rely on expert curation to some degree, specific examples of how this curation occurs and in what cases it is necessary are not always described. Here, we illustrate the manual classification strategy in ECOD by example, focusing on two major issues in protein classification: domain partitioning and the relationship between homology and similarity scores. Most examples show recently released and manually classified PDB structures. We discuss multi-domain proteins, discordance between sequence and structural similarities, difficulties with assessing homology with scores, and integral membrane proteins homologous to soluble proteins. By timely assimilation of newly available structures into its hierarchy, ECOD strives to provide a most accurate and updated view of the protein structure world as a result of combined computational and expert-driven analysis. © 2015 Wiley Periodicals, Inc.

Proteins. 2015:83(7) | 66 Citations (from Europe PMC, 2025-12-13)

ECOD: an evolutionary classification of protein domains. [PMID: 25474468]
Cheng H, Schaeffer RD, Liao Y, Kinch LN, Pei J, Shi S, Kim BH, Grishin NV.

Understanding the evolution of a protein, including both close and distant relationships, often reveals insight into its structure and function. Fast and easy access to such up-to-date information facilitates research. We have developed a hierarchical evolutionary classification of all proteins with experimentally determined spatial structures, and presented it as an interactive and updatable online database. ECOD (Evolutionary Classification of protein Domains) is distinct from other structural classifications in that it groups domains primarily by evolutionary relationships (homology), rather than topology (or "fold"). This distinction highlights cases of homology between domains of differing topology to aid in understanding of protein structure evolution. ECOD uniquely emphasizes distantly related homologs that are difficult to detect, and thus catalogs the largest number of evolutionary links among structural domain classifications. Placing distant homologs together underscores the ancestral similarities of these proteins and draws attention to the most important regions of sequence and structure, as well as conserved functional sites. ECOD also recognizes closer sequence-based relationships between protein domains. Currently, approximately 100,000 protein structures are classified in ECOD into 9,000 sequence families clustered into close to 2,000 evolutionary groups. The classification is assisted by an automated pipeline that quickly and consistently classifies weekly releases of PDB structures and allows for continual updates. This synchronization with PDB uniquely distinguishes ECOD among all protein classifications. Finally, we present several case studies of homologous proteins not recorded in other classifications, illustrating the potential of how ECOD can be used to further biological and evolutionary studies.

PLoS Comput Biol. 2014:10(12) | 319 Citations (from Europe PMC, 2025-12-13)

Ranking

All databases: 401/6895 (94.199%)
Structure: 45/967 (95.45%)
Total Rank: 401
Citations: 445
z-index: 40.454
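
The percentages listed next to each rank are numerically consistent with a percentile of the form (total - rank + 1) / total. This is an inference from the two rank figures in this record, not a formula documented here, and the function name `rank_percentile` is hypothetical. A minimal sketch under that assumption:

```python
def rank_percentile(rank: int, total: int) -> float:
    """Percentage of entries ranked at or below `rank` out of `total`.

    Assumed formula (inferred from the figures in this record, not a
    documented Database Commons definition): 100 * (total - rank + 1) / total.
    """
    if not 1 <= rank <= total:
        raise ValueError("rank must be between 1 and total")
    return 100.0 * (total - rank + 1) / total


# Rank figures taken from the Ranking section of this record.
print(round(rank_percentile(401, 6895), 3))  # all databases
print(round(rank_percentile(45, 967), 2))    # Structure category
```

Under this assumption the two calls reproduce the 94.199% and 95.45% values shown for this record. The z-index is left out because its definition is not given here.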

Record metadata

Created on: 2017-02-15
Curated by:
Yiran Zhan [2025-07-01]
Lin Liu [2022-09-20]
Dong Zou [2019-01-02]
Shixiang Sun [2017-02-15]