Pseudogenome Suffix Array

Introduction

We propose a lightweight data structure for indexing and querying collections of NGS reads in main memory. The data structure supports the interface proposed in the pioneering work by Philippe et al. for counting and locating k-mers in sequencing reads. Our solution, PgSA (pseudogenome suffix array), is based on finding overlapping reads and is competitive with existing algorithms in space usage, query time, or both. The main applications of our index include variant calling, error correction, and analysis of reads from RNA-seq experiments.
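
To illustrate the kind of query the index answers (counting and locating a k-mer of arbitrary length in a read collection), here is a minimal sketch in Python. It is not the PgSA implementation or its API: the function names `build_index`, `locate`, and `count` are hypothetical, and the reads are simply concatenated with a separator rather than merged by their overlaps into a compact pseudogenome, so it shows only the suffix-array lookup idea, not the space savings.

```python
from bisect import bisect_right


def build_index(reads, sep="$"):
    """Toy index: concatenate reads with a separator and sort all suffixes.

    Assumption: reads contain only A/C/G/T, so `sep` never occurs in a read.
    """
    text = sep.join(reads) + sep
    # starts[i] is the position in `text` where read i begins.
    starts = []
    pos = 0
    for read in reads:
        starts.append(pos)
        pos += len(read) + 1
    # Naive suffix array construction; fine for a toy, far too slow for real data.
    suffix_array = sorted(range(len(text)), key=lambda i: text[i:])
    return text, starts, suffix_array


def _bound(text, suffix_array, kmer, upper):
    """Binary search for the first suffix whose k-prefix is >= kmer (or > kmer if upper)."""
    lo, hi = 0, len(suffix_array)
    k = len(kmer)
    while lo < hi:
        mid = (lo + hi) // 2
        prefix = text[suffix_array[mid]:suffix_array[mid] + k]
        if prefix < kmer or (upper and prefix == kmer):
            lo = mid + 1
        else:
            hi = mid
    return lo


def locate(index, kmer):
    """Return sorted (read_id, offset) pairs for every occurrence of kmer."""
    text, starts, suffix_array = index
    first = _bound(text, suffix_array, kmer, upper=False)
    last = _bound(text, suffix_array, kmer, upper=True)
    hits = []
    for pos in suffix_array[first:last]:
        read_id = bisect_right(starts, pos) - 1
        hits.append((read_id, pos - starts[read_id]))
    return sorted(hits)


def count(index, kmer):
    """Return the number of occurrences of kmer across all reads."""
    text, _, suffix_array = index
    return (_bound(text, suffix_array, kmer, upper=True)
            - _bound(text, suffix_array, kmer, upper=False))


reads = ["ACGTAC", "GTACGT", "TTACGA"]
index = build_index(reads)
print(count(index, "ACG"))   # number of occurrences of ACG
print(locate(index, "TAC"))  # (read_id, offset) pairs for TAC
```

Because every occurrence of a k-mer is a prefix of some suffix, all matches form one contiguous range in the sorted suffix order, which is why two binary searches suffice regardless of k. A k-mer can never match across two reads here, since any window spanning a read boundary contains the separator.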

Publications

Indexing Arbitrary-Length k-Mers in Sequencing Reads.
Kowalski T, Grabowski S, Deorowicz S, 2015-01-01 - PLoS ONE

Credits

Tomasz Kowalski
Developer
Institute of Applied Computer Science, Lodz University of Technology, Poland
Szymon Grabowski
Developer
Institute of Applied Computer Science, Lodz University of Technology, Poland
Sebastian Deorowicz
Investigator
Institute of Informatics, Silesian University of Technology, Poland

Community Ratings

Usability	Efficiency	Reliability	Rated By
			0 user

Summary

Accession	BT001178
Tool Type	Application
Category
Platforms	Linux/Unix
Technologies	C++
User Interface	Terminal Command Line
Download Count	0
Country/Region	Poland
Submitted By	Sebastian Deorowicz
