Database Commons

a catalog of worldwide biological databases

Database Profile

General information

URL: http://orthomcl.org
Full name: Querying a Comprehensive Multi-species Collection of Ortholog Groups
Description: The OrthoMCL database (http://orthomcl.cbil.upenn.edu) houses ortholog group predictions for 55 species, including 16 bacterial and 4 archaeal genomes representing phylogenetically diverse lineages, and most currently available complete eukaryotic genomes: 24 unikonts (12 animals, 9 fungi, microsporidium, Dictyostelium, Entamoeba), 4 plants/algae and 7 apicomplexan parasites.
Year founded: 2006
Last update: 2015-07-23
Version: v5
Accessibility:
Manual: Inaccessible
Country/Region: United States

Classification & Tag

Data type: DNA
Data object:
Database category:
Major species: NA
Keywords:

Contact information

University/Institution: University of Pennsylvania
Address: Philadelphia, PA 19104-6018, USA
City: Philadelphia
Province/State: PA
Country/Region: United States
Contact name (PI/Team): David S. Roos
Contact email (PI/Helpdesk): droos@sas.upenn.edu

Publications

OrthoMCL-DB: querying a comprehensive multi-species collection of ortholog groups. [PMID: 16381887]
Chen F, Mackey AJ, Stoeckert CJ, Roos DS.

The OrthoMCL database (http://orthomcl.cbil.upenn.edu) houses ortholog group predictions for 55 species, including 16 bacterial and 4 archaeal genomes representing phylogenetically diverse lineages, and most currently available complete eukaryotic genomes: 24 unikonts (12 animals, 9 fungi, microsporidium, Dictyostelium, Entamoeba), 4 plants/algae and 7 apicomplexan parasites. OrthoMCL software was used to cluster proteins based on sequence similarity, using an all-against-all BLAST search of each species' proteome, followed by normalization of inter-species differences, and Markov clustering. A total of 511,797 proteins (81.6% of the total dataset) were clustered into 70,388 ortholog groups. The ortholog database may be queried based on protein or group accession numbers, keyword descriptions or BLAST similarity. Ortholog groups exhibiting specific phyletic patterns may also be identified, using either a graphical interface or a text-based Phyletic Pattern Expression grammar. Information for ortholog groups includes the phyletic profile, the list of member proteins and a multiple sequence alignment, a statistical summary and graphical view of similarities, and a graphical representation of domain architecture. OrthoMCL software, the entire FASTA dataset employed and clustering results are available for download. OrthoMCL-DB provides a centralized warehouse for orthology prediction among multiple species, and will be updated and expanded as additional genome sequence data become available.
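
The clustering step described in the abstract (similarity scores followed by Markov clustering) can be illustrated with a minimal sketch. This is not the OrthoMCL software itself: the similarity matrix below is a hypothetical toy input standing in for normalized BLAST scores, the function name markov_cluster and the inflation value are assumptions chosen for illustration, and the real pipeline includes normalization of inter-species differences that is not reproduced here.

```python
# Minimal sketch of Markov clustering (MCL) on a toy similarity matrix.
# Assumptions: the matrix, the inflation value, and all function names are
# illustrative and are not taken from the OrthoMCL software.

def normalize_columns(m):
    """Scale each column so that it sums to 1 (column-stochastic matrix)."""
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] if sums[j] else 0.0 for j in range(n)]
            for i in range(n)]

def matmul(a, b):
    """Plain square-matrix product."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def markov_cluster(sim, inflation=2.0, iterations=50, tol=1e-9):
    """Alternate expansion (matrix squaring) and inflation (element-wise
    power plus column normalization) until the matrix stops changing."""
    n = len(sim)
    # Add self-loops so that every node keeps some flow on itself.
    m = [[sim[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    m = normalize_columns(m)
    for _ in range(iterations):
        expanded = matmul(m, m)
        inflated = normalize_columns(
            [[x ** inflation for x in row] for row in expanded])
        delta = max(abs(inflated[i][j] - m[i][j])
                    for i in range(n) for j in range(n))
        m = inflated
        if delta < tol:
            break
    # Each row with non-negligible entries lists the members of one cluster.
    clusters = set()
    for i in range(n):
        members = frozenset(j for j in range(n) if m[i][j] > 1e-6)
        if members:
            clusters.add(members)
    return sorted(sorted(c) for c in clusters)

# Hypothetical input: proteins 0-2 are mutually similar, as are proteins 3-4,
# with no similarity between the two sets.
toy_similarity = [
    [0, 1, 1, 0, 0],
    [1, 0, 1, 0, 0],
    [1, 1, 0, 0, 0],
    [0, 0, 0, 0, 1],
    [0, 0, 0, 1, 0],
]
groups = markov_cluster(toy_similarity)
print(groups)
```

In this toy case the two disconnected sets of proteins are returned as two separate groups. Higher inflation values generally produce finer-grained clusters, so the value used in any real analysis would need to be tuned rather than taken from this sketch.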

Nucleic Acids Res. 2006:34(Database issue) | 556 Citations (from Europe PMC, 2024-04-06)

Ranking

All databases: 365/6000 (93.933%)
Phylogeny and homology: 21/259 (92.278%)
Total rank: 365
Citations: 556
z-index: 30.889

Community reviews

Not Rated
Data quality & quantity:
Content organization & presentation:
System accessibility & reliability:

Record metadata

Created on: 2015-09-06
Curated by:
Lina Ma [2018-06-08]
Lina Ma [2016-08-23]
Lin Liu [2016-04-17]
Lin Liu [2016-03-28]
Mengwei Li [2016-02-21]
Lina Ma [2015-11-11]