Introduction

Achieving complete, accurate, and cost-effective assembly of human genomes is of great importance for realizing the promise of precision medicine. The abundance of repeats and genetic variations in human genomes and the limitations of existing sequencing technologies call for the development of novel assembly methods that can leverage the complementary strengths of multiple technologies. We propose a Hybrid Structural variant Assembly (HySA) approach that integrates reads from next-generation sequencing and single-molecule sequencing technologies to accurately assemble and detect structural variants (SVs) in human genomes. By identifying homologous SV-containing reads from different technologies through a bipartite-graph-based clustering algorithm, our approach turns a whole-genome assembly problem into a set of independent SV assembly problems, each of which can be solved effectively to enhance the assembly of structurally altered regions in human genomes. We tested our approach on data generated from a haploid hydatidiform mole genome (CHM1) and a diploid human genome (NA12878). The results showed that, compared with existing methods, our approach had a low false discovery rate and substantially improved the detection of many types of SVs, particularly novel large insertions, small indels (10-50 bp), and short tandem repeat expansions and contractions. Our work highlights the strengths and limitations of current approaches and provides an effective solution for extending the power of existing sequencing technologies for SV discovery.
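
The clustering idea described above can be illustrated with a minimal sketch. This is not HySA's implementation (the tool itself is listed as Perl); it only assumes that some upstream step has produced pairs linking a short read to a long read, and it groups reads into connected components of the resulting bipartite graph with a union-find structure. The function name cluster_reads, the read identifiers, and the edge format are all hypothetical.

```python
from collections import defaultdict


def cluster_reads(edges):
    """Group reads into connected components of a bipartite graph.

    edges: iterable of (short_read_id, long_read_id) pairs. Each pair is a
    hypothetical link (for example, an alignment) between one short read
    and one long read; how links are found is outside this sketch.

    Returns a list of (short_reads, long_reads) pairs of sets, one per
    connected component.
    """
    parent = {}

    def find(node):
        # Union-find lookup with path halving.
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    # Tag each node with its side so identical IDs on the two sides
    # of the graph cannot collide.
    for short_id, long_id in edges:
        union(("short", short_id), ("long", long_id))

    groups = defaultdict(lambda: (set(), set()))
    for node in list(parent):
        kind, read_id = node
        side = 0 if kind == "short" else 1
        groups[find(node)][side].add(read_id)
    return list(groups.values())


if __name__ == "__main__":
    # Hypothetical read IDs: s1 and s2 share long read p1; s3 links to p2.
    example = [("s1", "p1"), ("s2", "p1"), ("s3", "p2")]
    for short_reads, long_reads in cluster_reads(example):
        print(sorted(short_reads), sorted(long_reads))
```

Each component returned here could, under these assumptions, be handed to a local assembler on its own, which is the sense in which one large assembly problem becomes a set of independent ones.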

Publications

  1. HySA: a Hybrid Structural variant Assembly approach using next-generation and single-molecule sequencing technologies.
    Fan X, Chaisson M, Nakhleh L, Chen K. Genome Research, 2017-05-01.

Credits

  1. Xian Fan
    Developer

    Department of Bioinformatics and Computational Biology, Division of Quantitative Sciences, United States of America

  2. Mark Chaisson
    Developer

    Department of Genome Sciences, University of Washington School of Medicine, United States of America

  3. Luay Nakhleh
    Developer

    Department of Computer Science, Rice University, United States of America

  4. Ken Chen
    Investigator

    Department of Bioinformatics and Computational Biology, Division of Quantitative Sciences, United States of America

Summary

Accession: BT006826
Tool Type: Application
Category:
Platforms: Linux/Unix
Technologies: Perl
User Interface: Terminal Command Line
Download Count: 0
Country/Region: United States of America
Submitted By: Ken Chen