Haplotype-resolved genome assembly is crucial for understanding allele-specific functions, but obtaining haplotype-resolved assemblies of auto-polyploid genomes remains challenging. Existing methods fall into three classes: reference-based phasing, assembly-based phasing, and gamete binning. Cost-effective and efficient methods for haplotyping auto-polyploid genomes are still lacking. In this study, we propose PolyGH, a novel phasing algorithm that combines Hi-C and gametic data, and evaluate it on tetraploid potato cultivars. The method has three steps. First, gametic data are used to bin non-collapsed contigs, and adjacent fragments of the same type within the same contig are merged. Second, unique k-mers are used to obtain accurate Hi-C signals in regions that differ between haplotypes. Finally, collapsed fragments are assigned to haplotigs based on the combined Hi-C and gametic signals. Compared with methods based on Hi-C data alone or on gametic data alone, PolyGH, which integrates both data types, performed better at haplotyping auto-polyploid genomes. This approach has the potential to improve haplotype-resolved assembly of auto-polyploid genomes.
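
The role of unique k-mers in the second step can be illustrated with a toy example. The sketch below is not the PolyGH implementation; the function names (`unique_kmers`, `assign_read`), the sequences, and the choice of k = 5 are hypothetical, and reverse complements are ignored for brevity. It only shows the general idea that a k-mer occurring exactly once across all contigs marks a region that differs between haplotypes, so a read carrying such a k-mer can be attributed to a single contig without ambiguity.

```python
from collections import Counter


def kmers(seq, k):
    """Yield every k-mer of seq (forward strand only, a simplification)."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def unique_kmers(contigs, k):
    """Map each k-mer that occurs exactly once across all contigs to its contig."""
    counts = Counter()
    for seq in contigs.values():
        counts.update(kmers(seq, k))
    owner = {}
    for name, seq in contigs.items():
        for km in kmers(seq, k):
            if counts[km] == 1:
                owner[km] = name
    return owner


def assign_read(read, owner, k):
    """Return the one contig supported by the read's unique k-mers, else None."""
    hits = {owner[km] for km in kmers(read, k) if km in owner}
    return hits.pop() if len(hits) == 1 else None


# Hypothetical toy haplotigs that differ at a single base.
contigs = {
    "hapA": "ACGTACGGTTCA",
    "hapB": "ACGTACGCTTCA",
}
owner = unique_kmers(contigs, k=5)

print(assign_read("CGGTT", owner, k=5))  # read overlapping the variant site
print(assign_read("ACGTA", owner, k=5))  # read from a region shared by both
```

In this toy case the first read overlaps the differing base and is attributed to `hapA`, while the second read comes from sequence shared by both haplotigs and is left unassigned. A real pipeline would work on alignments of Hi-C read pairs and canonical k-mers rather than raw substrings.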