Dense local haplotypes can now readily be extracted from long-read or droplet-based sequence data. However, these methods struggle to combine subchromosomal haplotype blocks into global chromosome-length haplotypes. Strand-seq is a single-cell sequencing technique that uses read orientation to capture sparse global phase information by sequencing only one of the two DNA strands of each parental homolog. Combined with dense local haplotypes from other technologies, Strand-seq data can be used to obtain complete chromosome-length phase information. In this chapter, we run the R package StrandPhaseR to phase SNVs using publicly available sequence data for sample HG005 of the Genome in a Bottle project.
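The idea of combining sparse global phase with dense local haplotype blocks can be illustrated with a toy sketch. This is not the StrandPhaseR algorithm or its API. The function `orient_blocks`, the dictionary-based data layout, and the majority-vote rule are all assumptions made for illustration. Each local block maps SNV positions to the allele (0 or 1) on one arbitrary side of the block, and the sparse global phase maps a few positions to the allele carried by global haplotype 1.

```python
from collections import Counter

def orient_blocks(local_blocks, global_phase):
    """Anchor dense local blocks to a sparse chromosome-length haplotype.

    local_blocks: list of dicts {position: allele}, one dict per block,
                  giving the allele on an arbitrary side of that block.
    global_phase: dict {position: allele} for global haplotype 1
                  (hypothetical stand-in for sparse Strand-seq phase).
    Returns a dict {position: allele} for global haplotype 1.
    """
    oriented = {}
    for block in local_blocks:
        votes = Counter()
        for pos, allele in block.items():
            if pos in global_phase:
                # Does this block side agree with global haplotype 1 here?
                votes["keep" if allele == global_phase[pos] else "flip"] += 1
        if votes["keep"] == votes["flip"]:
            # No overlapping SNVs, or tied evidence: leave block unanchored.
            continue
        flip = votes["flip"] > votes["keep"]
        for pos, allele in block.items():
            oriented[pos] = 1 - allele if flip else allele
    return oriented

# Two local blocks with no phase information linking them.
local_blocks = [
    {100: 0, 150: 1, 200: 0},
    {5000: 1, 5100: 1, 5200: 0},
]
# Sparse global phase covering one SNV in block 1 and two in block 2.
global_phase = {150: 1, 5100: 0, 5200: 1}

haplotype1 = orient_blocks(local_blocks, global_phase)
print(haplotype1)
```

In this example the first block already matches the global haplotype, while the second block is flipped, so all six SNVs end up on one chromosome-length haplotype. A real workflow would also need to handle sequencing errors and switch errors inside blocks, which this sketch ignores.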