Chromosome-level Genome Assembly of Korean Long-tailed Chicken and Pangenome of 40 Gallus gallus Assemblies.

Hanshin D Shin, Wonchoul Park, Han-Ha Chai, Youngho Lee, Jaehoon Jung, Byung June Ko, Heebal Kim
Author Information
  1. Hanshin D Shin: Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, Republic of Korea.
  2. Wonchoul Park: Animal Genomics & Bioinformatics Division, National Institute of Animal Science, RDA 1500, Wanju, 55365, Republic of Korea.
  3. Han-Ha Chai: Animal Genomics & Bioinformatics Division, National Institute of Animal Science, RDA 1500, Wanju, 55365, Republic of Korea.
  4. Youngho Lee: Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, Republic of Korea.
  5. Jaehoon Jung: Department of Agricultural Biotechnology and Research Institute of Agriculture and Life Sciences, Seoul National University, Seoul, Republic of Korea.
  6. Byung June Ko: Department of Agricultural Biotechnology and Research Institute of Agriculture and Life Sciences, Seoul National University, Seoul, Republic of Korea.
  7. Heebal Kim: Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, Republic of Korea. heebal@snu.ac.kr.

Abstract

This study presents the first chromosome-level genome assembly of the Korean long-tailed chicken (KLC), a unique breed of Gallus gallus known as Ginkkoridak. The assembly achieved a super contig N50 of 5.7 Mbp and a scaffold N50 exceeding 90 Mb, with a genome completeness of 96.3% as assessed by BUSCO using the aves_odb10 set. We also constructed a pangenome graph from 40 Gallus gallus assemblies, including the KLC genome. This graph comprises 87,934,214 nodes and 121,720,974 edges, with a total sequence length of 1,709,850,352 bp. Notably, the KLC assembly contributed 1,919,925 bp of new sequence to the pangenome, underscoring the unique genetic makeup of this breed. Furthermore, by comparing KLC against the pangenome, we identified 36,818 structural variants: 2,529 insertions, 27,743 deletions, and 6,546 insertions or deletions shorter than 1 kb. We also identified pangenome-wide non-reference sequences. The KLC assembly and pangenome graph provide valuable genomic resources for studying G. gallus populations.
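The N50 values quoted above follow the standard definition: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length. The Python sketch below illustrates that definition only; it is not the authors' pipeline. The function name `n50` and the toy contig lengths are hypothetical. The final check confirms that the three structural variant classes reported in the abstract sum to the stated total.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths (standard definition)."""
    ordered = sorted(lengths, reverse=True)  # longest sequences first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first sequence that brings the running sum to half the total is the N50.
        if running >= half_total:
            return length
    raise ValueError("lengths must not be empty")


# Hypothetical toy contig lengths, not data from the study.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))  # prints 70

# Structural variant counts as reported in the abstract.
sv_counts = {"insertions": 2529, "deletions": 27743, "indels_under_1kb": 6546}
assert sum(sv_counts.values()) == 36818
```

In the toy example the total length is 300, and the two longest contigs (80 + 70) already reach 150, so the N50 is 70.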

Grants

  1. PJ013341/Rural Development Administration (RDA)

MeSH Terms

Animals
Chickens
Genome
Chromosomes
Republic of Korea
