Introduction

While next-generation sequencing (NGS) has become the primary technology for discovering gene fusions, we still face the challenge of ensuring that causative mutations are not missed while minimizing false positives. Currently, many computational tools predict structural variations (SVs) and gene fusions using whole-genome sequencing (WGS) and transcriptome sequencing (RNA-seq) data separately. However, because WGS and RNA-seq each have limitations when used independently, we hypothesized that the orthogonal validation gained by integrating both data types could yield a sensitive and specific approach for making high-confidence gene fusion predictions. Fortunately, decreasing NGS costs have resulted in a growing number of patients with both data types available. We therefore developed a gene fusion discovery tool, INTEGRATE, that leverages both RNA-seq and WGS data to reconstruct gene fusion junctions and genomic breakpoints by split-read mapping. To evaluate INTEGRATE, we compared it with eight additional gene fusion discovery tools using the well-characterized breast cancer cell line HCC1395 and peripheral blood lymphocytes derived from the same patient (HCC1395BL). The predictions subsequently underwent targeted validation, leading to the discovery of 131 novel fusions in addition to the seven previously reported fusions. Overall, INTEGRATE missed only six of the 138 validated fusions and had the highest accuracy of the nine tools evaluated. Additionally, we applied INTEGRATE to 62 breast cancer patients from The Cancer Genome Atlas (TCGA) and found multiple recurrent gene fusions, including a subset involving the estrogen receptor. Taken together, INTEGRATE is a highly sensitive and accurate tool that is freely available for academic use.
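
The validation counts reported above can be turned into a sensitivity figure for the HCC1395 benchmark. The short Python sketch below does only that arithmetic. The variable names are illustrative and are not part of INTEGRATE, and the result is sensitivity on the validated set only, which is not necessarily the "accuracy" metric the authors used to rank the nine tools.

```python
# Counts taken from the abstract above; variable names are illustrative only.
novel_validated = 131       # novel fusions confirmed by targeted validation
previously_reported = 7     # fusions already known for HCC1395
missed_by_integrate = 6     # validated fusions that INTEGRATE did not report

total_validated = novel_validated + previously_reported
detected = total_validated - missed_by_integrate

# Sensitivity = validated fusions recovered / all validated fusions.
sensitivity = detected / total_validated

print(f"{detected}/{total_validated} validated fusions recovered "
      f"(sensitivity = {sensitivity:.1%})")
```

On these counts, 132 of the 138 validated fusions are recovered, a sensitivity of roughly 95.7%.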

Publications

  1. INTEGRATE: gene fusion discovery using whole genome and transcriptome data.
    Zhang J, White NM, Schmidt HK, Fulton RS, Tomlinson C, Warren WC, Wilson RK, Maher CA. Genome Research, 2016-01-01.

Credits

  1. Jin Zhang
    Developer

    McDonnell Genome Institute, Washington University School of Medicine, United States of America

  2. Nicole M White
    Developer

    Department of Internal Medicine, Division of Oncology, United States of America

  3. Heather K Schmidt
    Developer

    McDonnell Genome Institute, Washington University School of Medicine, United States of America

  4. Robert S Fulton
    Developer

    McDonnell Genome Institute, Washington University School of Medicine, United States of America

  5. Chad Tomlinson
    Developer

    McDonnell Genome Institute, Washington University School of Medicine, United States of America

  6. Wesley C Warren
    Developer

    McDonnell Genome Institute, Washington University School of Medicine, United States of America

  7. Richard K Wilson
    Developer

    McDonnell Genome Institute, Washington University School of Medicine, United States of America

  8. Christopher A Maher
    Investigator

    McDonnell Genome Institute, Washington University School of Medicine, United States of America

Summary

  Accession: BT000383
  Tool Type: Application
  Category: (none listed)
  Platforms: Linux/Unix
  Technologies: (none listed)
  User Interface: Terminal Command Line
  Download Count: 0
  Country/Region: United States of America
  Submitted By: Christopher A Maher