Database Commons

a catalog of worldwide biological databases

Database Profile

DarkHorse

General information

URL: http://darkhorse.ucsd.edu
Full name: DarkHorse HGT Candidate Resource
Description: DarkHorse is a bioinformatic method for rapid, automated identification and ranking of phylogenetically atypical proteins on a genome-wide basis. It works by selecting potential ortholog matches from a reference database of amino acid sequences, then using these matches to calculate a lineage probability index (LPI) score for each genome protein.
Year founded: 2007
Last update: 2008
Version: 2
Accessibility: Accessible
Country/Region: United States

Classification & Tag

Data type:
Data object:
Database category:
Major species:
Keywords:

Contact information

University/Institution: University of California San Diego
Address:
City: San Diego
Province/State:
Country/Region: United States
Contact name (PI/Team): Sheila Podell
Contact email (PI/Helpdesk): spodell@ucsd.edu

Publications

A database of phylogenetically atypical genes in archaeal and bacterial genomes, identified using the DarkHorse algorithm. [PMID: 18840280]
Podell S, Gaasterland T, Allen EE.

BACKGROUND: The process of horizontal gene transfer (HGT) is believed to be widespread in Bacteria and Archaea, but little comparative data is available addressing its occurrence in complete microbial genomes. Collection of high-quality, automated HGT prediction data based on phylogenetic evidence has previously been impractical for large numbers of genomes at once, due to prohibitive computational demands. DarkHorse, a recently described statistical method for discovering phylogenetically atypical genes on a genome-wide basis, provides a means to solve this problem through lineage probability index (LPI) ranking scores. LPI scores inversely reflect phylogenetic distance between a test amino acid sequence and its closest available database matches. Proteins with low LPI scores are good horizontal gene transfer candidates; those with high scores are not.
DESCRIPTION: The DarkHorse algorithm has been applied to 955 microbial genome sequences, and the results have been organized into a web-searchable relational database called the DarkHorse HGT Candidate Resource (http://darkhorse.ucsd.edu). Users can select individual genomes or groups of genomes to screen by LPI score, search for protein functions by descriptive annotation or amino acid sequence similarity, or select proteins with unusual G+C composition in their underlying coding sequences. The search engine reports LPI scores for match partners as well as query sequences, providing the opportunity to explore whether potential HGT donor sequences are phylogenetically typical or atypical within their own genomes. This information can be used to predict whether sufficient information is available to build a well-supported phylogenetic tree using the potential donor sequence.
CONCLUSION: The DarkHorse HGT Candidate database provides a powerful, flexible set of tools for identifying phylogenetically atypical proteins, allowing researchers to explore both individual HGT events in single genomes, and large-scale HGT patterns among protein families and genome groups. Although the DarkHorse algorithm cannot, by itself, provide definitive proof of horizontal gene transfer, it is a flexible, powerful tool that can be combined with slower, more rigorous methods in situations where these other methods could not otherwise be applied.
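
The two screens described in the abstract above (selecting proteins by LPI score and by unusual G+C composition) can be illustrated with a minimal Python sketch. It does not reproduce the DarkHorse algorithm or the database's query interface: the `ProteinRecord` type, the function names, the cutoff values, and the sample records are all hypothetical and exist only for illustration. LPI scores are taken as given inputs, not computed.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class ProteinRecord:
    protein_id: str  # hypothetical identifier
    lpi: float       # lineage probability index, supplied as input
    cds: str         # nucleotide coding sequence behind the protein


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def low_lpi_candidates(records, lpi_cutoff):
    """Records scoring below the cutoff, lowest (most atypical) first."""
    hits = [r for r in records if r.lpi < lpi_cutoff]
    return sorted(hits, key=lambda r: r.lpi)


def unusual_gc(records, max_deviation):
    """Records whose G+C fraction differs from the set mean by more than max_deviation."""
    if not records:
        return []
    avg = mean(gc_fraction(r.cds) for r in records)
    return [r for r in records if abs(gc_fraction(r.cds) - avg) > max_deviation]


# Made-up records for illustration only; not taken from the database.
records = [
    ProteinRecord("prot_A", 0.95, "ATGGCGCATTAA"),
    ProteinRecord("prot_B", 0.30, "ATGGGGCCCGGGTGA"),
    ProteinRecord("prot_C", 0.55, "ATGAAATTTTAA"),
]

for r in low_lpi_candidates(records, lpi_cutoff=0.6):
    print(r.protein_id, r.lpi, round(gc_fraction(r.cds), 3))
```

Sorting in ascending order puts the lowest LPI scores first, matching the abstract's statement that low-scoring proteins are the better horizontal gene transfer candidates. Both the LPI cutoff and the G+C deviation threshold are arbitrary here and would need to be chosen per genome.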

BMC Bioinformatics. 2008:9() | 30 Citations (from Europe PMC, 2026-04-04)

DarkHorse: a method for genome-wide prediction of horizontal gene transfer. [PMID: 17274820]
Podell S, Gaasterland T.

A new approach to rapid, genome-wide identification and ranking of horizontal transfer candidate proteins is presented. The method is quantitative, reproducible, and computationally undemanding. It can be combined with genomic signature and/or phylogenetic tree-building procedures to improve accuracy and efficiency. The method is also useful for retrospective assessments of horizontal transfer prediction reliability, recognizing orthologous sequences that may have been previously overlooked or unavailable. These features are demonstrated in bacterial, archaeal, and eukaryotic examples.

Genome Biol. 2007:8(2) | 129 Citations (from Europe PMC, 2026-04-04)

Ranking

All databases: 1499/6932 (78.39%)
Interaction: 292/1200 (75.75%)
Pathway: 91/454 (80.176%)
Total Rank: 1499
Citations: 157
z-index: 8.263
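
The percentages shown next to each rank are not defined on this page. As a hedged reading, all three values are arithmetically consistent with the share of databases ranked at or below this entry, (total - rank + 1) / total. The short Python sketch below checks that assumption; the function name `rank_percentile` is hypothetical, and the formula is inferred from the numbers, not documented by the catalog.

```python
def rank_percentile(rank: int, total: int) -> float:
    """Percentage of entries ranked at or below `rank` (assumed formula)."""
    return 100.0 * (total - rank + 1) / total


# (label, rank, total) triples copied from the ranking figures above.
rankings = [
    ("All databases", 1499, 6932),
    ("Interaction", 292, 1200),
    ("Pathway", 91, 454),
]

for label, rank, total in rankings:
    print(f"{label}: {rank}/{total} ({rank_percentile(rank, total):.3f}%)")
```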

Record metadata

Created on: 2018-01-26
Curated by:
Nashaiman Pervaiz [2018-12-28]
Mengyu Pan [2018-09-20]
Mengyu Pan [2018-02-21]
Lina Ma [2018-01-26]