Introduction

Hashing has been widely used for indexing, querying and rapid similarity search in many bioinformatics applications, including sequence alignment, genome and transcriptome assembly, k-mer counting and error correction. Hence, expediting hashing operations would have a substantial impact in the field, making bioinformatics applications faster and more efficient.

We present ntHash, a hashing algorithm tuned for processing DNA/RNA sequences. It performs best when calculating hash values for adjacent k-mers in an input sequence, operating an order of magnitude faster than the best-performing alternatives in typical use cases.

ntHash is available online at http://www.bcgsc.ca/platform/bioinfo/software/nthash and is free for academic use.

Contact: hmohamadi@bcgsc.ca or ibirol@bcgsc.ca

Supplementary information: Supplementary data are available at Bioinformatics online.
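
The idea of computing hash values for adjacent k-mers recursively can be illustrated with a generic rolling hash in the cyclic-polynomial style: the hash of the next k-mer is derived from the previous one with a rotation and two XORs, rather than by rehashing all k bases. The sketch below is illustrative only and is not the ntHash source code or API. The function names (`seed`, `rotl`, `hash_kmer`, `roll`, `hash_all_kmers`) are hypothetical, the per-base seed constants are arbitrary placeholders rather than the values ntHash uses, and reverse-complement (canonical) hashing and non-ACGT characters are not handled.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Placeholder 64-bit seeds, one per nucleotide. These are arbitrary
// constants chosen for illustration, NOT the seeds used by ntHash.
inline std::uint64_t seed(char base) {
    switch (base) {
        case 'A': return 0x9E3779B97F4A7C15ULL;
        case 'C': return 0xC2B2AE3D27D4EB4FULL;
        case 'G': return 0x165667B19E3779F9ULL;
        case 'T': return 0x27D4EB2F165667C5ULL;
        default:  return 0;  // other characters (e.g. N) get no special handling here
    }
}

// Rotate a 64-bit word left by r bits (r is reduced modulo 64).
inline std::uint64_t rotl(std::uint64_t x, unsigned r) {
    r &= 63u;
    return r == 0 ? x : (x << r) | (x >> (64u - r));
}

// Hash the k-mer starting at pos from scratch: XOR of the base seeds,
// each rotated by its distance from the end of the k-mer. Costs O(k).
std::uint64_t hash_kmer(const std::string& s, std::size_t pos, unsigned k) {
    std::uint64_t h = 0;
    for (unsigned i = 0; i < k; ++i) {
        h ^= rotl(seed(s[pos + i]), k - 1 - i);
    }
    return h;
}

// Derive the hash of the next k-mer from the previous hash in O(1):
// rotate the old hash by one, cancel the outgoing base, add the incoming base.
std::uint64_t roll(std::uint64_t h, char outgoing, char incoming, unsigned k) {
    return rotl(h, 1) ^ rotl(seed(outgoing), k) ^ seed(incoming);
}

// Hash every k-mer of s: one full hash for the first k-mer, then one
// rolling update per subsequent position.
std::vector<std::uint64_t> hash_all_kmers(const std::string& s, unsigned k) {
    assert(k > 0);
    std::vector<std::uint64_t> hashes;
    if (s.size() < k) return hashes;
    std::uint64_t h = hash_kmer(s, 0, k);
    hashes.push_back(h);
    for (std::size_t i = 0; i + k < s.size(); ++i) {
        h = roll(h, s[i], s[i + k], k);
        hashes.push_back(h);
    }
    return hashes;
}
```

Calling `hash_all_kmers("ACGTACGT", 4)` under these assumptions yields five values, and each rolled value equals what `hash_kmer` would compute from scratch at the same position. The saving comes from replacing the O(k) loop with a constant number of word operations per k-mer.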

Publications

  1. ntHash: recursive nucleotide hashing.
    Mohamadi H, Chu J, Vandervalk BP, Birol I, 2016-11-01 - Bioinformatics (Oxford, England)

Credits

  1. Hamid Mohamadi
    Developer

    Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer Agency, Canada

  2. Justin Chu
    Developer

    Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer Agency, Canada

  3. Benjamin P Vandervalk
    Developer

    Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer Agency, Canada

  4. Inanc Birol
    Investigator

    Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer Agency, Canada

Summary

  Accession: BT000118
  Tool Type: Application
  Category: (none listed)
  Platforms: Linux/Unix
  Technologies: C++
  User Interface: Terminal Command Line
  Download Count: 0
  Country/Region: Canada
  Submitted By: Inanc Birol