Database Commons

a catalog of worldwide biological databases

Database Profile

General information

URL: http://www.ebi.ac.uk/research/cgg/tribes
Full name: A resource containing protein family information, comprising annotations, protein sequence alignments and phylogenetic distributions.
Description: We presented a novel algorithm called TribeMCL for the detection of protein families that is both accurate and efficient. This method allows family analysis to be carried out on a very large scale. Using TribeMCL, we have generated a resource called TRIBES that contains protein family information, comprising annotations, protein sequence alignments and phylogenetic distributions describing 311 257 proteins from 83 completely sequenced genomes.
Year founded: 2003
Last update:
Version:
Accessibility: Inaccessible (manual check)
Country/Region: United Kingdom

Classification & Tag

Data type:
Data object:
Database category:
Major species:
Keywords:

Contact information

University/Institution: European Bioinformatics Institute
Address: Computational Genomics Group, The European Bioinformatics Institute, EMBL Cambridge Outstation, Cambridge CB10 1SD, UK
City: Cambridge
Province/State:
Country/Region: United Kingdom
Contact name (PI/Team): Christos A. Ouzounis
Contact email (PI/Helpdesk): ouzounis@ebi.ac.uk

Publications

Protein families and TRIBES in genome sequence space. [PMID: 12888524]
Enright AJ, Kunin V, Ouzounis CA.

Accurate detection of protein families allows assignment of protein function and the analysis of functional diversity in complete genomes. Recently, we presented a novel algorithm called TribeMCL for the detection of protein families that is both accurate and efficient. This method allows family analysis to be carried out on a very large scale. Using TribeMCL, we have generated a resource called TRIBES that contains protein family information, comprising annotations, protein sequence alignments and phylogenetic distributions describing 311 257 proteins from 83 completely sequenced genomes. The analysis of at least 60 934 detected protein families reveals that, with the essential families excluded, paralogy levels are similar between prokaryotes, irrespective of genome size. The number of essential families is estimated to be between 366 and 426. We also show that the currently known space of protein families is scale free and discuss the implications of this distribution. In addition, we show that smaller families are often formed by shorter proteins and discuss the reasons for this intriguing pattern. Finally, we analyse the functional diversity of protein families in entire genome sequences. The TRIBES protein family resource is accessible at http://www.ebi.ac.uk/research/cgg/tribes/.

Nucleic Acids Res. 2003:31(15) | 88 Citations (from Europe PMC, 2024-04-06)

Ranking

All databases: 1952/6000 (67.483%)
Gene genome and annotation: 584/1675 (65.194%)
Total rank: 1952
Citations: 88
z-index: 4.19
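
The ranking figures above can be reproduced with simple arithmetic. The sketch below is an inference from the numbers on this record, not a documented Database Commons formula. It assumes the percentage is (total - rank + 1) / total, and that the z-index is the citation count divided by the years between publication (2003) and the citation snapshot (2024). The function names are hypothetical.

```python
def percentile_from_rank(rank: int, total: int) -> float:
    """Share of databases ranked at or below this one, as a percentage.

    Assumed formula: (total - rank + 1) / total * 100.
    """
    return (total - rank + 1) / total * 100


def citations_per_year(citations: int, published: int, snapshot: int) -> float:
    """Assumed z-index: citations divided by years since publication."""
    return citations / (snapshot - published)


# Values taken from this record.
all_databases = round(percentile_from_rank(1952, 6000), 3)
category = round(percentile_from_rank(584, 1675), 3)
z_index = round(citations_per_year(88, 2003, 2024), 2)

print(all_databases, category, z_index)
```

Under these assumptions the three computed values match the percentages and z-index listed for this record.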

Community reviews

Not Rated
Data quality & quantity:
Content organization & presentation:
System accessibility & reliability:


Record metadata

Created on: 2018-02-09
Curated by:
Mengyu Pan [2018-09-21]
Qi Wang [2018-02-24]
Yang Zhang [2018-02-09]