Introduction

Matched sequencing of tumor and normal tissue is routinely used to classify variants of uncertain significance (VUS) as somatic or germline. However, assays used in molecular diagnostics focus on known somatic alterations in cancer genes and often sequence only the tumor. An algorithm that reliably classifies variants in such tumor-only data would therefore be helpful for retrospective exploratory analyses. Contamination of tumor samples with normal cells causes the expected allelic fractions of germline and somatic variants to differ, and this difference can be exploited to infer genotypes accurately after adjusting for local copy number. However, existing algorithms for determining tumor purity, ploidy and copy number are not designed for unmatched short read sequencing data.

We describe a methodology and corresponding open source software for estimating tumor purity, copy number, loss of heterozygosity (LOH), and contamination, and for classifying single nucleotide variants (SNVs) by somatic status and clonality. This R package, PureCN, is optimized for targeted short read sequencing data, integrates well with standard somatic variant detection pipelines, and supports both matched and unmatched tumor samples. Accuracy is demonstrated on simulated data and on real whole exome sequencing data.

Our algorithm provides accurate estimates of tumor purity and ploidy, even if matched normal samples are not available. This in turn allows accurate classification of SNVs. The software is provided as the open source (Artistic License 2.0) R/Bioconductor package PureCN (http://bioconductor.org/packages/PureCN/).
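The way normal-cell contamination separates germline from somatic allelic fractions can be illustrated with a small calculation. The sketch below is written in Python for illustration only and is not PureCN's API or implementation; the function name and its parameters are hypothetical. It assumes a diploid normal genome, a heterozygous germline variant (one variant copy in every normal cell), and a somatic variant that is absent from normal cells.

```python
def expected_allelic_fraction(purity, tumor_copies, multiplicity, germline):
    """Expected variant allelic fraction in a tumor/normal cell mixture.

    purity       -- fraction of cells in the sample that are tumor cells (0..1)
    tumor_copies -- total copy number of the locus in tumor cells
    multiplicity -- number of those tumor copies that carry the variant
    germline     -- True for a heterozygous germline variant, False for somatic

    Assumption: normal cells are diploid and carry one variant copy if the
    variant is germline and none if it is somatic.
    """
    if not 0.0 <= purity <= 1.0:
        raise ValueError("purity must be between 0 and 1")
    if not 0 <= multiplicity <= tumor_copies:
        raise ValueError("multiplicity must be between 0 and tumor_copies")

    normal_variant_copies = 1 if germline else 0
    # Average number of variant-carrying copies per cell in the mixture.
    variant_copies = purity * multiplicity + (1.0 - purity) * normal_variant_copies
    # Average total number of copies of the locus per cell in the mixture.
    total_copies = purity * tumor_copies + (1.0 - purity) * 2
    if total_copies == 0:
        raise ValueError("locus is absent from every cell in the sample")
    return variant_copies / total_copies


if __name__ == "__main__":
    # Copy-neutral region, 60% purity, variant on one of two tumor copies.
    for label, is_germline in (("somatic", False), ("germline", True)):
        fraction = expected_allelic_fraction(0.6, 2, 1, is_germline)
        print(f"{label}: {fraction:.3f}")
```

At 60% purity in a copy-neutral region, the two expectations differ by 0.2 (0.3 for the somatic variant, 0.5 for the germline one). At 100% purity both collapse to the same value, which is why some normal contamination is what makes the classification possible. Changing `tumor_copies` and `multiplicity` shows why the comparison only makes sense after adjusting for local copy number.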

Publications

  1. PureCN: copy number calling and SNV classification using targeted short read sequencing.
    Riester M, Singh AP, Brannon AR, Yu K, Campbell CD, Chiang DY, Morrissey MP, 2016-01-01 - Source Code for Biology and Medicine

Credits

  1. Markus Riester
    Developer

    Novartis Institutes for BioMedical Research, Cambridge

  2. Angad P Singh
    Developer

    Novartis Institutes for BioMedical Research, Cambridge

  3. A Rose Brannon
    Developer

    Novartis Institutes for BioMedical Research, Cambridge

  4. Kun Yu
    Developer

    Novartis Institutes for BioMedical Research, Cambridge

  5. Catarina D Campbell
    Developer

    Novartis Institutes for BioMedical Research, Cambridge

  6. Derek Y Chiang
    Developer

    Novartis Institutes for BioMedical Research, Cambridge

  7. Michael P Morrissey
    Investigator

    Novartis Institutes for BioMedical Research, Cambridge

Summary

Accession: BT001299
Tool Type: Application
Category:
Platforms: Linux/Unix
Technologies: R
User Interface: Terminal Command Line
Download Count: 0
Submitted By: Michael P Morrissey