NextPolish2: Repeat-aware polishing of genomes assembled using HiFi long reads
Introduction
Telomere-to-telomere (T2T) genome assembly has emerged as a new hotspot in genomics. A T2T genome is typically assembled from datasets that include both high-accuracy PacBio HiFi long reads and Oxford Nanopore Technologies (ONT) ultra-long reads. Although genomes assembled from HiFi long reads are of considerably higher quality, they still contain a handful of assembly errors in regions where HiFi reads themselves are error-prone, such as homopolymers and low-complexity microsatellites. In addition, the gap-filling step is typically performed with ONT ultra-long reads, which contain a certain number of errors. Current T2T assemblies therefore still need improvement in consensus accuracy. NextPolish2 can fix these errors (SNVs and indels) in a high-quality assembly. Through its built-in phasing module, it corrects only the erroneous bases while maintaining the original haplotype consistency. As a result, NextPolish2 does not produce overcorrections even in regions with complex repeat elements, and in some cases it can reduce switch errors in heterozygous regions. NextPolish2 is not an upgraded version of NextPolish but a supplement to it for the pursuit of extremely high-quality genome assemblies.
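To make the notion of homopolymer regions concrete, the following minimal Python sketch lists runs of a single repeated base in a sequence. It is an illustration only and not part of NextPolish2: the function name `homopolymer_runs` and the default threshold of 5 bases are assumptions chosen for this example.

```python
from itertools import groupby


def homopolymer_runs(seq, min_len=5):
    """Return (start, base, length) for every run of one base
    that is at least min_len long (hypothetical helper)."""
    runs = []
    pos = 0
    # groupby yields one group per maximal stretch of identical characters
    for base, group in groupby(seq.upper()):
        length = sum(1 for _ in group)
        if length >= min_len:
            runs.append((pos, base, length))
        pos += length
    return runs


if __name__ == "__main__":
    # Toy contig built explicitly so the run lengths are unambiguous
    contig = "ACGT" + "A" * 7 + "CG" + "T" * 6 + "GCA"
    for start, base, length in homopolymer_runs(contig):
        print(f"{base} x{length} at offset {start}")
```

Offsets are 0-based. A scan like this only flags candidate regions; it says nothing about whether the assembled bases in them are actually wrong.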
Publications
No Publication Information
Credits
- Jiang Hu huj@grandomics.com Developer
RD, Grandomics Biosciences Co., Ltd., China
Community Ratings
Usability | Efficiency | Reliability | Rated By |
---|---|---|---|
1 user | |||
huj***j@grandomics.com (October 25, 2023) |
Accession | BT007383 |
---|---|
Tool Type | Application |
Category | Error correction |
Platforms | Linux/Unix, MAC OS X |
Technologies | |
User Interface | Terminal Command Line |
Input Data | BAM, FASTA, FASTQ, SAM |
Latest Release | 0.2.0 (October 7, 2023) |
Download Count | 98 |
Submitted By | Jiang Hu |