Introduction

We present a new approach to indel calling that explicitly exploits the fact that indel differences between a reference and a sequenced sample make read mapping less efficient. We assign every unmapped read that has a mapped partner to its expected genomic position and then perform extensive de novo assembly of the regions containing many unmapped reads, resolving homozygous, heterozygous, and complex indels by exhaustive traversal of the de Bruijn graph. The method is implemented in the software SOAPindel, which outputs a list of candidate indels with quality scores. We compare SOAPindel to Dindel, Pindel, and GATK on simulated data and find similar or better performance for short indels (<10 bp) and higher sensitivity and specificity for long indels. A validation experiment suggests that SOAPindel has a false-positive rate of ∼10% for long indels (>5 bp), while still providing many more candidate indels than other approaches.
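
As a rough illustration of the assembly step described above, the following Python sketch builds a small de Bruijn graph from the reads collected in one region, enumerates every path between the k-mers that anchor the region in the reference, and reports paths whose length differs from the reference as candidate indels. It is not SOAPindel's implementation: the function names, the choice of k, and the toy sequences are all hypothetical, and the sketch ignores the placement of unmapped reads, sequencing errors, and quality scoring.

```python
from collections import defaultdict


def build_graph(reads, k):
    """Map each (k-1)-mer to the set of (k-1)-mers that follow it in a read."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return graph


def enumerate_paths(graph, start, end, max_len):
    """Exhaustively list the sequences spelled by paths from start to end.

    max_len bounds the sequence length so that cycles cannot loop forever.
    Assumes start != end.
    """
    results = []
    stack = [(start, start)]
    while stack:
        node, seq = stack.pop()
        if node == end and len(seq) > len(start):
            results.append(seq)
            continue
        if len(seq) >= max_len:
            continue
        for nxt in sorted(graph.get(node, ())):
            stack.append((nxt, seq + nxt[-1]))
    return results


def call_indels(reference, reads, k=5, max_extra=20):
    """Report assembled paths whose length differs from the reference region."""
    graph = build_graph(reads, k)
    start, end = reference[:k - 1], reference[-(k - 1):]
    calls = []
    for seq in enumerate_paths(graph, start, end, len(reference) + max_extra):
        diff = len(seq) - len(reference)
        if diff > 0:
            calls.append(("insertion", diff, seq))
        elif diff < 0:
            calls.append(("deletion", -diff, seq))
    return calls


def tile(seq, size):
    """Toy read generator: every window of the given size (hypothetical helper)."""
    return [seq[i:i + size] for i in range(len(seq) - size + 1)]


# Toy heterozygous sample: one haplotype matches the reference,
# the other lacks the 3 bp "GGT" at positions 7-9.
reference = "ACGTACCGGTTAGCAATCGA"
alt_haplotype = reference[:7] + reference[10:]
reads = tile(reference, 10) + tile(alt_haplotype, 10)
calls = call_indels(reference, reads)
```

Because reads from both haplotypes enter the same graph, the traversal finds two paths between the anchors. The one matching the reference is discarded and the shorter one is reported as a deletion, which is how exhaustive enumeration can expose a heterozygous indel in this simplified setting.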

Publications

  1. SOAPindel: efficient identification of indels from short paired reads.
    Li S, Li R, Li H, Lu J, Li Y, Bolund L, Schierup MH, Wang J, Wang J. Genome Research, 2013-01-01.

Credits

  1. Shengting Li
    Developer

  2. Ruiqiang Li
    Developer

  3. Heng Li
    Developer

  4. Jianliang Lu
    Developer

  5. Yingrui Li
    Developer

  6. Lars Bolund
    Developer

  7. Mikkel H Schierup
    Developer

  8. Jun Wang
    Developer

Summary

Accession: BT000028
Tool Type: Application
Category: (not listed)
Platforms: Linux/Unix
Technologies: (not listed)
User Interface: Terminal Command Line
Download Count: 0
Submitted By: Jun Wang