Introduction

Next-generation sequencing (NGS) technologies have gained considerable popularity among biologists. For example, RNA-seq, which provides both genomic and functional information, has been widely used in recent functional and evolutionary studies, especially in non-model organisms. However, storing and transmitting these large data sets (primarily in FASTQ format) has become a genuine challenge, especially for biologists with little informatics experience. Data compression is thus a necessity. KIC, a FASTQ compressor based on a new integer-mapped k-mer indexing method, was developed to meet this need (available at http://www.ysunlab.org/kic.jsp). It offers a high compression ratio on sequence data, outstanding user-friendliness through graphical user interfaces, and proven reliability. When evaluated on multiple large RNA-seq data sets from both human and plants, KIC achieved a compression ratio that exceeded those of all major generic compressors and was comparable to those of the latest dedicated compressors. KIC enables researchers with minimal informatics training to take advantage of the latest sequence compression technologies, easily manage large FASTQ data sets, and reduce storage and transmission costs.
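
The description above does not spell out how the integer-mapped k-mer indexing works, so the following is only a minimal sketch of the general idea of mapping k-mers to integers, not KIC's actual implementation. It assumes a 2-bit encoding per base (A=0, C=1, G=2, T=3); the function names `kmer_to_int`, `int_to_kmer`, and `index_kmers` are hypothetical and chosen for illustration.

```python
# Illustrative sketch only: a common way to map k-mers to integers.
# The 2-bit encoding and all names below are assumptions, not taken from KIC.
BASE_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
CODE_BASE = "ACGT"


def kmer_to_int(kmer):
    """Pack a k-mer into an integer, two bits per base."""
    value = 0
    for base in kmer:
        # Raises KeyError for ambiguous bases such as 'N'; a real
        # compressor would need a separate path for those.
        value = (value << 2) | BASE_CODE[base]
    return value


def int_to_kmer(value, k):
    """Unpack an integer back into a k-mer of length k."""
    bases = []
    for _ in range(k):
        bases.append(CODE_BASE[value & 3])
        value >>= 2
    return "".join(reversed(bases))


def index_kmers(read, k):
    """Map each k-mer integer to the list of positions where it occurs."""
    index = {}
    for i in range(len(read) - k + 1):
        index.setdefault(kmer_to_int(read[i:i + k]), []).append(i)
    return index


if __name__ == "__main__":
    code = kmer_to_int("ACGT")
    print(code, int_to_kmer(code, 4))
    print(index_kmers("ACGTACGT", 4))
```

Under this encoding a k-mer occupies 2k bits rather than 8k bits as ASCII text, and repeated k-mers share a single integer key, which is the kind of redundancy an index-based compressor can exploit.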

Publications

  1. A FASTQ compressor based on integer-mapped k-mer indexing for biologist.
    Zhang Y, Patel K, Endrawis T, Bowers A, Sun Y. Gene, 2016-03-01.

Credits

  1. Yeting Zhang
    Developer

    Department of Research, Synblex LLC

  2. Khyati Patel
    Developer

    New Jersey Center for Science Technology and Mathematics, Kean University

  3. Tony Endrawis
    Developer

    New Jersey Center for Science Technology and Mathematics, Kean University

  4. Autumn Bowers
    Developer

    New Jersey Center for Science Technology and Mathematics, Kean University

  5. Yazhou Sun
    Investigator

    New Jersey Center for Science Technology and Mathematics, Kean University

Summary

Accession: BT006391
Tool Type: Application
Category:
Platforms: Linux/Unix
Technologies:
User Interface: Terminal Command Line
Download Count: 0
Submitted By: Yazhou Sun