Introduction

Next-generation sequencing technologies have profoundly impacted biology in recent years. Experimental protocols such as photoactivatable ribonucleoside-enhanced cross-linking and immunoprecipitation (PAR-CLIP), which identifies protein-RNA interactions on a genome-wide scale, commonly employ deep sequencing. In PAR-CLIP, the incorporation of photoactivatable nucleosides into nascent transcripts leads to high rates of specific nucleotide conversions during reverse transcription. So far, the specific properties of PAR-CLIP-derived sequencing reads have not been assessed in depth.

We compared PAR-CLIP sequencing reads to regular transcriptome sequencing (RNA-Seq) reads to identify distinctive properties that are relevant for reference-based read alignment of PAR-CLIP datasets. We developed a set of freely available tools for PAR-CLIP data analysis, called the PAR-CLIP analyzer suite (PARA-suite). The PARA-suite includes error model inference, PAR-CLIP read simulation based on PAR-CLIP-specific properties, a full read alignment pipeline with a modified Burrows-Wheeler Aligner algorithm, and CLIP read clustering for binding site detection.

We show that the error profiles of PAR-CLIP reads differ from those of RNA-Seq reads, which makes dedicated processing advantageous. We examined the alignment accuracy of commonly applied read aligners on 10 simulated PAR-CLIP datasets using different parameter settings and identified the most accurate setup among those read aligners. We demonstrate the performance of the PARA-suite in conjunction with different binding site detection algorithms on several real PAR-CLIP and HITS-CLIP datasets. Our processing pipeline improved both alignment accuracy and binding site detection accuracy.

The PARA-suite toolkit and the PARA-suite aligner are available at https://github.com/akloetgen/PARA-suite and https://github.com/akloetgen/PARA-suite_aligner, respectively, under the GNU GPLv3 license.
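
To make the notion of a read error profile more concrete, the following is a minimal Python sketch that tallies nucleotide substitutions between aligned reference and read sequences. It is not the PARA-suite implementation: the function names, the gap-free input format, and the toy sequences are all assumptions made for illustration. With 4-thiouridine as the photoactivatable nucleoside, the conversion usually reported for PAR-CLIP is T to C, so the toy data is built to show that pattern.

```python
from collections import Counter


def conversion_profile(pairs):
    """Count substitutions between aligned (reference, read) sequence pairs.

    Assumption: each pair is a gap-free alignment of equal length.
    Real alignments (e.g. from SAM/BAM files) would need indel handling.
    """
    counts = Counter()
    for ref, read in pairs:
        if len(ref) != len(read):
            raise ValueError("aligned sequences must have equal length")
        for r, q in zip(ref.upper(), read.upper()):
            # Only count mismatches between unambiguous bases.
            if r != q and r in "ACGT" and q in "ACGT":
                counts[(r, q)] += 1
    return counts


def conversion_rates(counts):
    """Express each substitution type as a fraction of all substitutions."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {f"{r}>{q}": n / total for (r, q), n in counts.items()}


# Invented toy alignments (hypothetical data, not taken from any real dataset).
toy_pairs = [
    ("ACGTTGCA", "ACGCTGCA"),
    ("TTAGGCAT", "TCAGGCAC"),
    ("GGATTACA", "GGATTACA"),
    ("CATTGACG", "CATCGACA"),
]

profile = conversion_profile(toy_pairs)
rates = conversion_rates(profile)

for substitution, rate in sorted(rates.items()):
    print(f"{substitution}: {rate:.2f}")
```

A profile dominated by a single substitution type, as in this toy input, is the kind of signal that would distinguish such reads from reads with a more uniform error distribution.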

Publications

  1. The PARA-suite: PAR-CLIP specific sequence read simulation and processing.
    Kloetgen A, Borkhardt A, Hoell JI, McHardy AC, 2016-01-01 - PeerJ

Credits

  1. Andreas Kloetgen
    Developer

    Department for Algorithmic Bioinformatics, Heinrich-Heine Universität Düsseldorf, Germany

  2. Arndt Borkhardt
    Developer

    Department of Pediatric Oncology, Hematology and Clinical Immunology, Germany

  3. Jessica I Hoell
    Developer

    Department of Pediatric Oncology, Hematology and Clinical Immunology, Germany

  4. Alice C McHardy
    Investigator

    Department for Algorithmic Bioinformatics, Heinrich-Heine Universität Düsseldorf, Germany

Summary
Accession: BT003701
Tool Type: Application
Category:
Platforms: Linux/Unix
Technologies: C
User Interface: Terminal Command Line
Download Count: 0
Country/Region: Germany
Submitted By: Alice C McHardy