Database Commons

a catalog of worldwide biological databases

Database Profile

GenDiS

General information

URL: http://caps.ncbs.res.in/gendis/
Full name: Genomic Distribution of Protein Structural Domain Superfamilies
Description: The GenDiS database surveys protein domains listed in sequence databases, using a 3-fold sequence search approach.
Year founded: 2005
Last update: 2025-03-17
Version: v3.0
Accessibility: Accessible
Country/Region: India

Classification & Tag

Data type:
Data object:
Database category:
Major species: NA
Keywords:

Contact information

University/Institution: Tata Institute of Fundamental Research
Address: Bellary Road, Bangalore 560 065, Karnataka, India
City: Bangalore
Province/State: Karnataka
Country/Region: India
Contact name (PI/Team): Ramanathan Sowdhamini
Contact email (PI/Helpdesk): mini@ncbs.res.in

Publications

GenDiS3 database: census on the prevalence of protein domain superfamilies of known structure in the entire sequence database. [PMID: 40343712]
Joshi S, Mohapatra S, Kumar D, Joshi A, Iyer M, Sowdhamini R.

Despite the vast amount of sequence data available, a significant disparity exists between the number of protein sequences identified and the relatively few structures that have been resolved. This disparity highlights the challenge in structural biology to bridge the gap between sequence information and 3D structural data, and the necessity for robust databases capable of linking distant homologs to known structures. Studies have indicated that there are a limited number of structural folds, despite the vast diversity of proteins. Hence, computational tools can enhance our ability to classify protein sequences, well before their structures are determined or their functions are characterized, thereby bridging the gap between sequence and structural data. GenDiS (Genomic Distribution of Superfamilies) is a repository with information on the genomic distribution of protein domain superfamilies, involving a one-time computational exercise to search for trusted homologs of protein domains of known structures against the vast sequence database. We have updated this database employing advanced bioinformatics tools, including DELTA-BLAST (domain enhanced lookup time accelerated BLAST) for initial detection of hits and HMMSCAN for validation, significantly improving the accuracy of domain identification. Using these tools, over 151 million sequence homologs for 2060 superfamilies [SCOPe (Structural Classification of Proteins extended)] were identified and 116 million of them were validated as true positives. Through a case study on glycolysis-related enzymes, variations in domain architectures of these enzymes are explored, revealing evolutionary changes and functional diversity among these essential proteins. We present another case, the LOG gene, in which significant mutations can be found across the evolutionary lineage. The GenDiS database, GenDiS3, and the associated tools made available at https://caps.ncbs.res.in/gendis3/ offer a powerful resource for researchers in functional annotation and evolutionary studies. Database URL: https://caps.ncbs.res.in/gendis3/.

Database (Oxford). 2025:2025() | 0 Citations (from Europe PMC, 2025-12-13)
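
The two-stage procedure described in the abstract above (initial detection of hits, then validation by a second method) can be pictured as a filter that keeps only the candidate hits a second check confirms. The following is a minimal sketch under stated assumptions, not the authors' pipeline: the function name, the sequence identifiers, and the superfamily labels are hypothetical, and no real DELTA-BLAST or HMMSCAN output is parsed. Only the two totals (about 151 million hits, 116 million validated) come from the abstract.

```python
def validate_hits(candidate_hits, confirmed_hits):
    """Keep only the candidate hits that the second method also reports."""
    confirmed = set(confirmed_hits)
    return [hit for hit in candidate_hits if hit in confirmed]


# Hypothetical (sequence_id, superfamily) pairs, for illustration only.
candidates = [("seqA", "sf1"), ("seqB", "sf1"), ("seqC", "sf2")]  # stage 1: initial search
confirmed = [("seqA", "sf1"), ("seqC", "sf2"), ("seqD", "sf3")]   # stage 2: validation
validated = validate_hits(candidates, confirmed)
print(validated)

# Totals quoted in the abstract, in millions ("over 151" is treated as 151 here).
reported_hits_millions = 151
reported_validated_millions = 116
validated_fraction = reported_validated_millions / reported_hits_millions
print(f"{validated_fraction:.1%}")
```

Under these assumptions, about 77% of the reported hits were retained after validation; because the abstract gives the first total only as "over 151 million", this share is approximate.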
GenDiS: Genomic Distribution of protein structural domain Superfamilies. [PMID: 15608190]
Pugalenthi G, Bhaduri A, Sowdhamini R.

Several proteins that have substantially diverged during evolution retain similar three-dimensional structures and biological function in spite of poor sequence identity. The database on Genomic Distribution of protein structural domain Superfamilies (GenDiS) provides a record of the distribution of 4001 protein domains organized as 1194 structural superfamilies across 18,997 genomes at various levels of hierarchy in taxonomy. The GenDiS database provides a survey of protein domains listed in sequence databases, employing a 3-fold sequence search approach. Lineage-specific literature is obtained from the taxonomy database for individual protein members to provide a platform for performing genomic and phyletic studies across organisms. The database documents residual properties and provides alignments for the various superfamily members in genomes, offering insights into the rational design of experiments and for the better understanding of a superfamily. The GenDiS database can be accessed at http://www.ncbs.res.in/~faculty/mini/gendis/home.html.

Nucleic Acids Res. 2005:33(Database issue) | 14 Citations (from Europe PMC, 2025-12-13)

Ranking

All databases:
5767/6895 (16.374%)
Structure:
803/967 (17.063%)
Gene genome and annotation:
1737/2021 (14.102%)
Total Rank: 5767
Citations: 14
z-index: 0.7
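
The percentages listed above are consistent with the share of databases ranked at or below this entry, that is, (total - rank + 1) / total. This reading is inferred from the three figures on this page and is not stated by Database Commons, so treat the formula as an assumption; the function name below is hypothetical.

```python
def share_at_or_below(rank, total):
    """Percentage of entries ranked at or below `rank` among `total` (assumed formula)."""
    return round((total - rank + 1) / total * 100, 3)


# Rank and total pairs as listed in the Ranking section.
rankings = [
    ("All databases", 5767, 6895),
    ("Structure", 803, 967),
    ("Gene genome and annotation", 1737, 2021),
]
for label, rank, total in rankings:
    print(f"{label}: {share_at_or_below(rank, total)}%")
```

Under this assumption the three computed values match the listed 16.374%, 17.063%, and 14.102%.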

Record metadata

Created on: 2015-07-11
Curated by:
Shaosen Zhang [2025-08-05]
Lin Liu [2016-03-26]
Mengwei Li [2016-02-20]
Jian Sang [2015-12-11]