| URL: | http://darkhorse.ucsd.edu |
| Full name: | DarkHorse HGT Candidate Resource |
| Description: | DarkHorse is a bioinformatic method for rapid, automated identification and ranking of phylogenetically atypical proteins on a genome-wide basis. It works by selecting potential ortholog matches from a reference database of amino acid sequences, then using these matches to calculate a lineage probability index (LPI) score for each genome protein. |
| Year founded: | 2007 |
| Last update: | 2008 |
| Version: | 2 |
| Accessibility: | Accessible |
| Country/Region: | United States |
| Data type: | |
| Data object: | |
| Database category: | |
| Major species: | |
| Keywords: | |
| University/Institution: | University of California San Diego |
| Address: | |
| City: | San Diego |
| Province/State: | |
| Country/Region: | United States |
| Contact name (PI/Team): | Sheila Podell |
| Contact email (PI/Helpdesk): | spodell@ucsd.edu |
|
A database of phylogenetically atypical genes in archaeal and bacterial genomes, identified using the DarkHorse algorithm. [PMID: 18840280]
BACKGROUND: The process of horizontal gene transfer (HGT) is believed to be widespread in Bacteria and Archaea, but little comparative data is available addressing its occurrence in complete microbial genomes. Collection of high-quality, automated HGT prediction data based on phylogenetic evidence has previously been impractical for large numbers of genomes at once, due to prohibitive computational demands. DarkHorse, a recently described statistical method for discovering phylogenetically atypical genes on a genome-wide basis, provides a means to solve this problem through lineage probability index (LPI) ranking scores. LPI scores inversely reflect phylogenetic distance between a test amino acid sequence and its closest available database matches. Proteins with low LPI scores are good horizontal gene transfer candidates; those with high scores are not. |
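
The ranking step described in the abstract above (low LPI score means a stronger HGT candidate) can be illustrated with a minimal Python sketch. It does not compute LPI scores and is not the DarkHorse implementation. It assumes scores have already been obtained, assumes they fall between 0 and 1, and uses a hypothetical function name (`rank_hgt_candidates`), hypothetical protein identifiers, made-up score values, and an arbitrary illustrative cutoff of 0.5.

```python
from typing import Dict, List, Tuple


def rank_hgt_candidates(
    lpi_scores: Dict[str, float],
    cutoff: float = 0.5,  # hypothetical cutoff, chosen only for illustration
) -> List[Tuple[str, float]]:
    """Return (protein_id, score) pairs scoring below `cutoff`, lowest first.

    Lower LPI scores indicate stronger horizontal gene transfer candidates,
    so the strongest candidates appear at the top of the returned list.
    """
    candidates = [(pid, score) for pid, score in lpi_scores.items() if score < cutoff]
    # Sort by score, then by identifier so that ties are ordered deterministically.
    return sorted(candidates, key=lambda item: (item[1], item[0]))


if __name__ == "__main__":
    # Made-up example scores; real values would come from a DarkHorse run.
    example_scores = {"protA": 0.98, "protB": 0.12, "protC": 0.45, "protD": 0.87}
    for protein_id, score in rank_hgt_candidates(example_scores):
        print(f"{protein_id}\t{score:.2f}")
```

With the example values, only `protB` and `protC` fall below the cutoff, and `protB` is listed first because its score is lower. In practice an appropriate cutoff would depend on the genome and reference database, so the value here should not be read as a recommendation.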
|
DarkHorse: a method for genome-wide prediction of horizontal gene transfer. [PMID: 17274820]
A new approach to rapid, genome-wide identification and ranking of horizontal transfer candidate proteins is presented. The method is quantitative, reproducible, and computationally undemanding. It can be combined with genomic signature and/or phylogenetic tree-building procedures to improve accuracy and efficiency. The method is also useful for retrospective assessments of horizontal transfer prediction reliability, recognizing orthologous sequences that may have been previously overlooked or unavailable. These features are demonstrated in bacterial, archaeal, and eukaryotic examples. |