NextPolish2: Repeat-aware polishing of genomes assembled using HiFi long reads

Introduction

Telomere-to-telomere (T2T) genome assembly has emerged as a hot topic in genomics. A T2T genome is typically obtained from datasets that combine high-accuracy PacBio HiFi long reads with Oxford Nanopore Technologies (ONT) ultra-long reads. Although genomes assembled from HiFi long reads are of considerably higher quality, they still contain a small number of assembly errors in regions where HiFi reads themselves are error-prone, such as homopolymers and low-complexity microsatellites. In addition, gap filling is typically done with ONT ultra-long reads, which contain a certain amount of errors. Current T2T assemblies therefore still need improvement in consensus accuracy. NextPolish2 can fix these errors (SNVs and indels) in a high-quality assembly. Through its built-in phasing module, it corrects only the erroneous bases while preserving the original haplotype consistency, so it does not overcorrect even in regions with complex repeat elements. In some cases it can also reduce switch errors in heterozygous regions. NextPolish2 is not an upgraded version of NextPolish, but a supplement to it for those pursuing extremely high-quality genome assemblies.

Publications

No Publication Information

Credits

  1. Jiang Hu huj@grandomics.com
    Developer

    RD, Grandomics Biosciences Co., Ltd., China

Summary
Accession: BT007383
Tool Type: Application
Category: Error correction
Platforms: Linux/Unix, Mac OS X
Technologies:
User Interface: Terminal Command Line
Input Data: BAM, FASTA, FASTQ, SAM
Latest Release: 0.2.0 (October 7, 2023)
Download Count: 98
Submitted By: Jiang Hu