Introduction

The explosion of whole-genome sequencing (WGS) as a tool for mapping and understanding genomes has been accompanied by an equally large number of reported tools and pipelines for the analysis of DNA copy number variation (CNV). Most currently available tools are designed specifically for human genomes, with comparatively little literature devoted to CNVs in prokaryotic organisms, even though prokaryotic WGS data have several idiosyncrasies of their own. This work proposes a step-by-step approach for the detection and quantification of copy number variants specifically aimed at prokaryotes. After aligning WGS reads to a reference genome, we count the individual reads in a sliding window and normalize these counts for bias introduced by differences in GC content. We then investigate the coverage in two fundamentally different ways: (i) with a hidden Markov model and (ii) by repeated sampling with replacement (bootstrapping) on each individual gene. The latter bypasses the complex problem of breakpoint determination. To demonstrate our method, we apply it to real and simulated WGS data and benchmark it against two popular methods for CNV detection. In some cases, the proposed methodology represents a significant improvement in accuracy over other current methods. CNOGpro is written entirely in the R programming language and is available from the CRAN repository (http://cran.r-project.org) under the GNU General Public License.
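
The counting, GC normalization, and per-gene bootstrap steps described above can be illustrated with a small sketch. This is not CNOGpro's code or API (the package itself is written in R); it is a simplified Python illustration under stated assumptions: windows are non-overlapping rather than sliding, GC bias is corrected by rescaling each window to the median of its GC-content bin, and the copy number of a gene is estimated as its mean normalized coverage divided by a genome-wide baseline. All function names and parameters are hypothetical, and the hidden Markov model step is not covered.

```python
import random
from statistics import median


def window_counts(read_starts, genome_length, window=100):
    """Count reads whose start position falls in each fixed-size window.

    Assumption: non-overlapping windows, a simplification of a sliding window.
    """
    n_windows = (genome_length + window - 1) // window
    counts = [0] * n_windows
    for pos in read_starts:
        counts[pos // window] += 1
    return counts


def gc_normalize(counts, gc_values, n_bins=10):
    """Rescale each window count by (global median / median of its GC bin).

    Assumption: a simple median-per-bin correction, chosen for illustration.
    """
    global_median = median(counts)
    bins = {}
    for count, gc in zip(counts, gc_values):
        b = min(int(gc * n_bins), n_bins - 1)
        bins.setdefault(b, []).append(count)
    bin_median = {b: median(values) for b, values in bins.items()}

    normalized = []
    for count, gc in zip(counts, gc_values):
        b = min(int(gc * n_bins), n_bins - 1)
        m = bin_median[b]
        # Leave the count unscaled if its bin has a zero median.
        normalized.append(count * global_median / m if m > 0 else float(count))
    return normalized


def bootstrap_copy_number(gene_cov, baseline, n_boot=1000, seed=0):
    """Percentile bootstrap interval for a gene's copy number.

    gene_cov holds the normalized window counts overlapping one gene;
    baseline is the coverage expected for a single-copy region.
    """
    rng = random.Random(seed)
    n = len(gene_cov)
    estimates = []
    for _ in range(n_boot):
        # Resample the gene's windows with replacement.
        sample = rng.choices(gene_cov, k=n)
        estimates.append(sum(sample) / n / baseline)
    estimates.sort()
    lower = estimates[int(0.025 * n_boot)]
    upper = estimates[int(0.975 * n_boot) - 1]
    return lower, upper
```

Because the bootstrap resamples only the windows inside a gene's annotated coordinates, no breakpoints need to be located, which is the property the abstract highlights for approach (ii).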

Publications

  1. CNOGpro: detection and quantification of CNVs in prokaryotic whole-genome sequencing data.
    Brynildsrud O, Snipen LG, Bohlin J, 2015-06-01 - Bioinformatics (Oxford, England)

Credits

  1. Ola Brynildsrud
    Developer

    Section for Biostatistics and Epidemiology, Norwegian University of Life Sciences (NMBU), Norway

  2. Lars-Gustav Snipen
    Developer

    Section for Biostatistics and Epidemiology, Norwegian University of Life Sciences (NMBU), Norway

  3. Jon Bohlin
    Investigator

    Section for Biostatistics and Epidemiology, Norwegian University of Life Sciences (NMBU), Norway

Summary

Accession: BT006818
Tool Type: Application
Category:
Platforms: Linux/Unix
Technologies: R
User Interface: Terminal Command Line
Download Count: 0
Country/Region: Norway
Submitted By: Jon Bohlin