| URL: | http://hsdfinder.com/database/ |
| Full name: | a database of highly similar duplicate genes |
| Description: | Gene duplication is an important evolutionary mechanism capable of providing new genetic material, and recent examples have indicated that highly similar duplicate genes (HSDs) are aiding adaptation to extreme conditions via gene dosage. However, for most eukaryotic genomes HSDs remain uncharacterized, partly because they can be hard to identify and categorize efficiently and effectively. Here, we collected and curated HSDs from various model animals, land plants and algae and indexed them in an online, open-access sequence repository called HSDatabase. It can be used on its own for comparative analyses of gene duplicates or in conjunction with HSDFinder, a newly developed bioinformatics tool for identifying, annotating, categorizing and visualizing HSDs. |
| Year founded: | 2021 |
| Last update: | 2022 |
| Version: | v1.0 |
| Accessibility: | Accessible |
| Country/Region: | Canada |
| Data type: | |
| Data object: | |
| Database category: | |
| Major species: | |
| Keywords: | |
| University/Institution: | University of Western Ontario |
| Address: | |
| City: | London |
| Province/State: | Ontario |
| Country/Region: | Canada |
| Contact name (PI/Team): | Xi Zhang |
| Contact email (PI/Helpdesk): | xzha25@uwo.ca |
|
HSDatabase-a database of highly similar duplicate genes from plants, animals, and algae. [PMID: 36208223]
Gene duplication is an important evolutionary mechanism capable of providing new genetic material, which in some instances can help organisms adapt to various environmental conditions. Recent studies, for example, have indicated that highly similar duplicate genes (HSDs) are aiding adaptation to extreme conditions via gene dosage. However, for most eukaryotic genomes HSDs remain uncharacterized, partly because they can be hard to identify and categorize efficiently and effectively. Here, we collected and curated HSDs in nuclear genomes from various model animals, land plants and algae and indexed them in an online, open-access sequence repository called HSDatabase. Currently, this database contains 117 864 curated HSDs from 40 distinct genomes; it includes statistics on the total number of HSDs per genome as well as individual HSD copy numbers/lengths and provides sequence alignments of the duplicate gene copies. HSDatabase also allows users to download sequences of gene copies, access genome browsers, and link out to other databases, such as Pfam and Kyoto Encyclopedia of Genes and Genomes. What is more, a built-in Basic Local Alignment Search Tool option is available to conveniently explore potential homologous sequences of interest within and across species. HSDatabase has a user-friendly interface and provides easy access to the source data. It can be used on its own for comparative analyses of gene duplicates or in conjunction with HSDFinder, a newly developed bioinformatics tool for identifying, annotating, categorizing and visualizing HSDs. Database URL: http://hsdfinder.com/database/. |
|
Protocol for HSDFinder: Identifying, annotating, categorizing, and visualizing duplicated genes in eukaryotic genomes. [PMID: 34223195]
Although gene duplications have been documented in many species, the precise numbers of highly similar duplicated genes (HSDs) in eukaryotic nuclear genomes remain largely unknown and can be time-consuming to explore. We developed HSDFinder to identify, categorize, and visualize HSDs in eukaryotic nuclear genomes using protein family domains and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways. In contrast to existing tools, HSDFinder allows users to compare HSDs among different species and visualize results in different KEGG pathway functional categories via heatmap plotting. For complete details on the use and execution of this protocol, please refer to Zhang et al. (2021). |
|
HSDFinder: A BLAST-Based Strategy for Identifying Highly Similar Duplicated Genes in Eukaryotic Genomes. [PMID: 36303740]
Gene duplication is an important evolutionary mechanism capable of providing new genetic material for adaptive and nonadaptive evolution. However, bioinformatics tools for identifying duplicate genes are often limited to the detection of paralogs in multiple species or to specific types of gene duplicates, such as retrocopies. Here, we present a user-friendly, BLAST-based web tool, called HSDFinder, which can identify, annotate, categorize, and visualize highly similar duplicate genes (HSDs) in eukaryotic nuclear genomes. HSDFinder includes an online heatmap plotting option, allowing users to compare HSDs among different species and visualize the results in different Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway functional categories. The external software requirements are BLAST, InterProScan, and KEGG. The utility of HSDFinder was tested on various model eukaryotic species, including Chlamydomonas reinhardtii, Arabidopsis thaliana, Oryza sativa, and Zea mays as well as the psychrophilic green alga Chlamydomonas sp. UWO241, and was proven to be a practical and accurate tool for gene duplication analyses. The web tool is free to use at http://hsdfinder.com. Documentation and tutorials can be found via the GitHub: https://github.com/zx0223winner/HSDFinder. |
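The BLAST-based strategy described in the abstract above can be illustrated with a small sketch. This is not HSDFinder's actual code: it only assumes an all-against-all protein BLAST run saved in the standard 12-column tabular format (`-outfmt 6`: query, subject, percent identity, ...), plus a separate table of protein lengths. The function name `find_hsd_groups`, the `lengths` mapping, and the default thresholds (`min_identity=90.0`, `max_len_diff=10`) are hypothetical, chosen for illustration, and should be tuned to the analysis at hand.

```python
from collections import defaultdict


def find_hsd_groups(blast_lines, lengths, min_identity=90.0, max_len_diff=10):
    """Group genes linked by highly similar BLAST hits (illustrative sketch).

    blast_lines  -- iterable of tab-separated BLAST tabular (-outfmt 6) lines
    lengths      -- dict mapping sequence ID to protein length (assumed input)
    min_identity -- minimum percent identity to keep a hit (assumed threshold)
    max_len_diff -- maximum length difference in residues (assumed threshold)
    """
    parent = {}

    def find(gene):
        # Union-find lookup with path halving.
        parent.setdefault(gene, gene)
        while parent[gene] != gene:
            parent[gene] = parent[parent[gene]]
            gene = parent[gene]
        return gene

    for line in blast_lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        query, subject, pident = fields[0], fields[1], float(fields[2])
        if query == subject:
            continue  # skip self-hits
        if pident < min_identity:
            continue  # not similar enough
        if abs(lengths[query] - lengths[subject]) > max_len_diff:
            continue  # copies differ too much in length
        parent[find(query)] = find(subject)  # merge the two groups

    groups = defaultdict(set)
    for gene in list(parent):
        groups[find(gene)].add(gene)
    return sorted(sorted(group) for group in groups.values())


# Toy input: only g1, g2 and g3 pass both filters.
example_hits = [
    "g1\tg1\t100.0\t300\t0\t0\t1\t300\t1\t300\t0.0\t600",
    "g1\tg2\t95.0\t298\t15\t0\t1\t298\t1\t298\t0.0\t550",
    "g2\tg3\t92.0\t300\t24\t0\t1\t300\t1\t300\t0.0\t520",
    "g4\tg5\t70.0\t200\t60\t0\t1\t200\t1\t200\t1e-50\t200",
    "g6\tg7\t99.0\t100\t1\t0\t1\t100\t1\t100\t1e-60\t210",
]
example_lengths = {"g1": 300, "g2": 302, "g3": 305,
                   "g4": 200, "g5": 200, "g6": 100, "g7": 150}
print(find_hsd_groups(example_hits, example_lengths))
```

Grouping by connected components means that a gene joins a group if it is highly similar to any member, which is one reasonable reading of "duplicate gene copies"; a stricter analysis could require every pair in a group to pass the filters.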