ParDRe

Introduction

Current next-generation sequencing technologies often generate duplicated or near-duplicated reads that, depending on the application scenario, provide no useful biological information but increase the memory requirements and computational time of downstream analysis. ParDRe is a de novo parallel tool that removes duplicated and near-duplicated reads by clustering Single-End or Paired-End sequences from FASTA or FASTQ files. It uses a novel bitwise approach to compare the suffixes of DNA strings and employs hybrid MPI/multithreading to reduce runtime on multicore systems. ParDRe is up to 27.29 times faster than Fulcrum (a representative state-of-the-art tool) on a platform with two 8-core Sandy-Bridge processors. Source code in C++ and MPI running on Linux systems, as well as a reference manual, are available at https://sourceforge.net/projects/pardre/. Contact: jgonzalezd@udc.es.
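To illustrate the general idea of comparing DNA strings bitwise, the following is a minimal C++ sketch and not ParDRe's actual implementation. It assumes a 2-bit encoding per base (A=00, C=01, G=10, T=11), XORs the packed words, and counts the differing bases with a population count. The names pack2bit, countMismatches and nearDuplicate, as well as the mismatch threshold parameter, are hypothetical and chosen only for this example.

```cpp
#include <bitset>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: pack a DNA string into 64-bit words, 2 bits per base
// (A=00, C=01, G=10, T=11). Any other character is treated as A, which is a
// simplification made only for this sketch.
std::vector<std::uint64_t> pack2bit(const std::string& seq) {
    std::vector<std::uint64_t> words((seq.size() + 31) / 32, 0);
    for (std::size_t i = 0; i < seq.size(); ++i) {
        std::uint64_t code = 0;
        switch (seq[i]) {
            case 'C': code = 1; break;
            case 'G': code = 2; break;
            case 'T': code = 3; break;
            default:  code = 0; break;
        }
        words[i / 32] |= code << (2 * (i % 32));
    }
    return words;
}

// Count mismatching bases between two packed sequences of equal length.
std::size_t countMismatches(const std::vector<std::uint64_t>& a,
                            const std::vector<std::uint64_t>& b) {
    const std::uint64_t lowBits = 0x5555555555555555ULL;  // low bit of each base
    std::size_t mismatches = 0;
    for (std::size_t w = 0; w < a.size(); ++w) {
        std::uint64_t diff = a[w] ^ b[w];
        // Fold each 2-bit difference into its low bit: one set bit per mismatch.
        std::uint64_t perBase = (diff | (diff >> 1)) & lowBits;
        mismatches += std::bitset<64>(perBase).count();
    }
    return mismatches;
}

// Two sequences count as near-duplicates here if they have the same length
// and differ in at most maxMismatches bases (an assumed criterion).
bool nearDuplicate(const std::string& s1, const std::string& s2,
                   std::size_t maxMismatches) {
    if (s1.size() != s2.size()) return false;
    return countMismatches(pack2bit(s1), pack2bit(s2)) <= maxMismatches;
}
```

With this sketch, nearDuplicate("ACGTACGT", "ACGTACGA", 1) evaluates to true because the two strings differ in a single base. Comparing 32 bases per 64-bit word is what makes such a bitwise scheme cheaper than a character-by-character loop.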

Publications

ParDRe: faster parallel duplicated reads removal tool for sequencing studies.
González-Domínguez J, Schmidt B, 2016-05-01 - Bioinformatics (Oxford, England)

Credits

Jorge González-Domínguez
Developer
Grupo de Arquitectura de Computadores, Universidade da Coruña
Bertil Schmidt
Investigator
Parallel and Distributed Architectures Group, Johannes Gutenberg University Mainz, Germany

Community Ratings

Usability	Efficiency	Reliability	Rated By
			0 user

Summary

Accession	BT002331
Tool Type	Application
Category
Platforms	Linux/Unix
Technologies	C++
User Interface	Terminal Command Line
Download Count	0
Country/Region	Germany
Submitted By	Bertil Schmidt
