Database Commons

a catalog of worldwide biological databases

Database Profile

HSDatabase

General information

URL: http://hsdfinder.com/database/
Full name: a database of highly similar duplicate genes
Description: Gene duplication is an important evolutionary mechanism capable of providing new genetic material, and recent examples have indicated that highly similar duplicate genes (HSDs) are aiding adaptation to extreme conditions via gene dosage. However, for most eukaryotic genomes HSDs remain uncharacterized, partly because they can be hard to identify and categorize efficiently and effectively. Here, we collected and curated HSDs from various model animals, land plants and algae and indexed them in an online, open-access sequence repository called HSDatabase. It can be used on its own for comparative analyses of gene duplicates or in conjunction with HSDFinder, a newly developed bioinformatics tool for identifying, annotating, categorizing and visualizing HSDs.
Year founded: 2021
Last update: 2022
Version: v1.0
Accessibility: Accessible
Country/Region: Canada

Funding support

  • NSERC

Classification & Tag

Data type:
Data object:
Database category:
Major species:
Keywords:

Contact information

University/Institution: University of Western Ontario
Address:
City: London
Province/State: Ontario
Country/Region: Canada
Contact name (PI/Team): Xi Zhang
Contact email (PI/Helpdesk): xzha25@uwo.ca

Publications

HSDatabase-a database of highly similar duplicate genes from plants, animals, and algae. [PMID: 36208223]
Zhang X, Hu Y, Smith DR.

Gene duplication is an important evolutionary mechanism capable of providing new genetic material, which in some instances can help organisms adapt to various environmental conditions. Recent studies, for example, have indicated that highly similar duplicate genes (HSDs) are aiding adaptation to extreme conditions via gene dosage. However, for most eukaryotic genomes HSDs remain uncharacterized, partly because they can be hard to identify and categorize efficiently and effectively. Here, we collected and curated HSDs in nuclear genomes from various model animals, land plants and algae and indexed them in an online, open-access sequence repository called HSDatabase. Currently, this database contains 117 864 curated HSDs from 40 distinct genomes; it includes statistics on the total number of HSDs per genome as well as individual HSD copy numbers/lengths and provides sequence alignments of the duplicate gene copies. HSDatabase also allows users to download sequences of gene copies, access genome browsers, and link out to other databases, such as Pfam and Kyoto Encyclopedia of Genes and Genomes. What is more, a built-in Basic Local Alignment Search Tool option is available to conveniently explore potential homologous sequences of interest within and across species. HSDatabase has a user-friendly interface and provides easy access to the source data. It can be used on its own for comparative analyses of gene duplicates or in conjunction with HSDFinder, a newly developed bioinformatics tool for identifying, annotating, categorizing and visualizing HSDs. Database URL: http://hsdfinder.com/database/.

Database (Oxford). 2022:2022() | 5 Citations (from Europe PMC, 2025-12-13)

Protocol for HSDFinder: Identifying, annotating, categorizing, and visualizing duplicated genes in eukaryotic genomes. [PMID: 34223195]
Zhang X, Hu Y, Smith DR.

Although gene duplications have been documented in many species, the precise numbers of highly similar duplicated genes (HSDs) in eukaryotic nuclear genomes remain largely unknown and can be time-consuming to explore. We developed HSDFinder to identify, categorize, and visualize HSDs in eukaryotic nuclear genomes using protein family domains and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways. In contrast to existing tools, HSDFinder allows users to compare HSDs among different species and visualize results in different KEGG pathway functional categories via heatmap plotting. For complete details on the use and execution of this protocol, please refer to Zhang et al. (2021).

STAR Protoc. 2021:2(3) | 12 Citations (from Europe PMC, 2025-12-13)

HSDFinder: A BLAST-Based Strategy for Identifying Highly Similar Duplicated Genes in Eukaryotic Genomes. [PMID: 36303740]
Zhang X, Hu Y, Smith DR.

Gene duplication is an important evolutionary mechanism capable of providing new genetic material for adaptive and nonadaptive evolution. However, bioinformatics tools for identifying duplicate genes are often limited to the detection of paralogs in multiple species or to specific types of gene duplicates, such as retrocopies. Here, we present a user-friendly, BLAST-based web tool, called HSDFinder, which can identify, annotate, categorize, and visualize highly similar duplicate genes (HSDs) in eukaryotic nuclear genomes. HSDFinder includes an online heatmap plotting option, allowing users to compare HSDs among different species and visualize the results in different Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway functional categories. The external software requirements are BLAST, InterProScan, and KEGG. The utility of HSDFinder was tested on various model eukaryotic species, including Chlamydomonas reinhardtii, Arabidopsis thaliana, Oryza sativa, and Zea mays as well as the psychrophilic green alga Chlamydomonas sp. UWO241, and was proven to be a practical and accurate tool for gene duplication analyses. The web tool is free to use at http://hsdfinder.com. Documentation and tutorials can be found via the GitHub: https://github.com/zx0223winner/HSDFinder.

Front Bioinform. 2021:1() | 10 Citations (from Europe PMC, 2025-12-13)

Ranking

All databases: 2178/6895 (68.426%)
Gene genome and annotation: 680/2021 (66.403%)
Total Rank: 2178
Citations: 23
z-index: 5.75
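
The ranking figures above appear consistent with simple arithmetic, sketched below. This is a reconstruction, not a formula stated on this page. It assumes the percentage is the share of databases ranked at or below this entry, (total - rank + 1) / total, and that the z-index is citations divided by years since the founding year, using 2025 (the year of the citation snapshot) as the current year. The function names are hypothetical.

```python
def rank_percentile(rank: int, total: int) -> float:
    """Share (in %) of databases ranked at or below `rank`.

    Assumed formula: (total - rank + 1) / total. It reproduces the
    percentages shown on this page but is not documented here.
    """
    return round((total - rank + 1) / total * 100, 3)


def z_index(citations: int, current_year: int, founded_year: int) -> float:
    """Citations per year since founding (assumed definition)."""
    return citations / (current_year - founded_year)


print(rank_percentile(2178, 6895))  # all databases
print(rank_percentile(680, 2021))   # gene genome and annotation category
print(z_index(23, 2025, 2021))      # 23 citations, founded 2021
```

Under these assumptions the three calls give 68.426, 66.403 and 5.75, matching the values listed in this record.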

Community reviews

Not Rated

Record metadata

Created on: 2022-11-25
Curated by:
Lina Ma [2022-12-08]
Michael Chang [2022-11-30]
Lina Ma [2022-11-26]
Michael Chang [2022-11-25]