T2T-YAO: A Telomere-to-telomere Assembled Diploid Reference Genome for Han Chinese.

Yukun He, Yanan Chu, Shuming Guo, Jiang Hu, Ran Li, Yali Zheng, Xinqian Ma, Zhenglin Du, Lili Zhao, Wenyi Yu, Jianbo Xue, Wenjie Bian, Feifei Yang, Xi Chen, Pingan Zhang, Rihan Wu, Yifan Ma, Changjun Shao, Jing Chen, Jian Wang, Jiwei Li, Jing Wu, Xiaoyi Hu, Qiuyue Long, Mingzheng Jiang, Hongli Ye, Shixu Song, Guangyao Li, Yue Wei, Yu Xu, Yanliang Ma, Yanwen Chen, Keqiang Wang, Jing Bao, Wen Xi, Fang Wang, Wentao Ni, Moqin Zhang, Yan Yu, Shengnan Li, Yu Kang, Zhancheng Gao
Author Information
  1. Yukun He: Department of Respiratory and Critical Care Medicine, Peking University People's Hospital, Beijing 100044, China.
  2. Yanan Chu: CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China.
  3. Shuming Guo: Linfen Clinical Medicine Research Center, Linfen 041000, China; Institute of Chest and Lung Diseases, Shanxi Medical University, Taiyuan 030001, China.
  4. Jiang Hu: GrandOmics Biosciences Co., Ltd, Wuhan 430076, China.
  5. Ran Li: Department of Respiratory and Critical Care Medicine, Peking University People's Hospital, Beijing 100044, China; Beijing Key Laboratory of Genome and Precision Medicine Technologies, Beijing 100101, China.
  6. Yali Zheng: Department of Respiratory, Critical Care and Sleep Medicine, Xiang'an Hospital of Xiamen University, School of Medicine, Xiamen University, Xiamen 361101, China.
  7. Xinqian Ma: Department of Respiratory and Critical Care Medicine, Peking University People's Hospital, Beijing 100044, China; Beijing Key Laboratory of Genome and Precision Medicine Technologies, Beijing 100101, China.
  8. Zhenglin Du: Institute of PSI Genomics, Wenzhou 325024, China.
  9. Lili Zhao: Department of Respiratory and Critical Care Medicine, Peking University People's Hospital, Beijing 100044, China; Beijing Key Laboratory of Genome and Precision Medicine Technologies, Beijing 100101, China.
  10. Wenyi Yu: Department of Respiratory and Critical Care Medicine, Peking University People's Hospital, Beijing 100044, China; Beijing Key Laboratory of Genome and Precision Medicine Technologies, Beijing 100101, China.
  11. Jianbo Xue: Department of Respiratory and Critical Care Medicine, Peking University People's Hospital, Beijing 100044, China; Beijing Key Laboratory of Genome and Precision Medicine Technologies, Beijing 100101, China.
  12. Wenjie Bian: Department of Respiratory and Critical Care Medicine, Peking University People's Hospital, Beijing 100044, China; Beijing Key Laboratory of Genome and Precision Medicine Technologies, Beijing 100101, China.
  13. Feifei Yang: Department of Respiratory and Critical Care Medicine, Peking University People's Hospital, Beijing 100044, China; Beijing Key Laboratory of Genome and Precision Medicine Technologies, Beijing 100101, China.
  14. Xi Chen: Department of Respiratory and Critical Care Medicine, Peking University People's Hospital, Beijing 100044, China; Beijing Key Laboratory of Genome and Precision Medicine Technologies, Beijing 100101, China.
  15. Pingan Zhang: Department of Respiratory and Critical Care Medicine, Peking University People's Hospital, Beijing 100044, China; Beijing Key Laboratory of Genome and Precision Medicine Technologies, Beijing 100101, China.
  16. Rihan Wu: Department of Respiratory and Critical Care Medicine, Peking University People's Hospital, Beijing 100044, China; Beijing Key Laboratory of Genome and Precision Medicine Technologies, Beijing 100101, China.
  17. Yifan Ma: Department of Respiratory and Critical Care Medicine, Peking University People's Hospital, Beijing 100044, China; Beijing Key Laboratory of Genome and Precision Medicine Technologies, Beijing 100101, China.
  18. Changjun Shao: CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China.
  19. Jing Chen: CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China.
  20. Jian Wang: CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China.
  21. Jiwei Li: Department of Respiratory, Critical Care and Sleep Medicine, Xiang'an Hospital of Xiamen University, School of Medicine, Xiamen University, Xiamen 361101, China.
  22. Jing Wu: Department of Respiratory, Critical Care and Sleep Medicine, Xiang'an Hospital of Xiamen University, School of Medicine, Xiamen University, Xiamen 361101, China.
  23. Xiaoyi Hu: Department of Respiratory, Critical Care and Sleep Medicine, Xiang'an Hospital of Xiamen University, School of Medicine, Xiamen University, Xiamen 361101, China.
  24. Qiuyue Long: Department of Respiratory, Critical Care and Sleep Medicine, Xiang'an Hospital of Xiamen University, School of Medicine, Xiamen University, Xiamen 361101, China.
  25. Mingzheng Jiang: Department of Respiratory, Critical Care and Sleep Medicine, Xiang'an Hospital of Xiamen University, School of Medicine, Xiamen University, Xiamen 361101, China.
  26. Hongli Ye: Department of Respiratory, Critical Care and Sleep Medicine, Xiang'an Hospital of Xiamen University, School of Medicine, Xiamen University, Xiamen 361101, China.
  27. Shixu Song: Department of Respiratory, Critical Care and Sleep Medicine, Xiang'an Hospital of Xiamen University, School of Medicine, Xiamen University, Xiamen 361101, China.
  28. Guangyao Li: Linfen Clinical Medicine Research Center, Linfen 041000, China.
  29. Yue Wei: Linfen Clinical Medicine Research Center, Linfen 041000, China.
  30. Yu Xu: Beijing Jishuitan Hospital, Capital Medical University, Beijing 100035, China.
  31. Yanliang Ma: Department of Respiratory and Critical Care Medicine, Peking University People's Hospital, Beijing 100044, China; Beijing Key Laboratory of Genome and Precision Medicine Technologies, Beijing 100101, China.
  32. Yanwen Chen: Department of Respiratory and Critical Care Medicine, Peking University People's Hospital, Beijing 100044, China; Beijing Key Laboratory of Genome and Precision Medicine Technologies, Beijing 100101, China.
  33. Keqiang Wang: Department of Respiratory and Critical Care Medicine, Peking University People's Hospital, Beijing 100044, China; Beijing Key Laboratory of Genome and Precision Medicine Technologies, Beijing 100101, China.
  34. Jing Bao: Department of Respiratory and Critical Care Medicine, Peking University People's Hospital, Beijing 100044, China; Beijing Key Laboratory of Genome and Precision Medicine Technologies, Beijing 100101, China.
  35. Wen Xi: Department of Respiratory and Critical Care Medicine, Peking University People's Hospital, Beijing 100044, China; Beijing Key Laboratory of Genome and Precision Medicine Technologies, Beijing 100101, China.
  36. Fang Wang: Department of Respiratory and Critical Care Medicine, Peking University People's Hospital, Beijing 100044, China; Beijing Key Laboratory of Genome and Precision Medicine Technologies, Beijing 100101, China.
  37. Wentao Ni: Department of Respiratory and Critical Care Medicine, Peking University People's Hospital, Beijing 100044, China; Beijing Key Laboratory of Genome and Precision Medicine Technologies, Beijing 100101, China.
  38. Moqin Zhang: Department of Respiratory and Critical Care Medicine, Peking University People's Hospital, Beijing 100044, China; Beijing Key Laboratory of Genome and Precision Medicine Technologies, Beijing 100101, China.
  39. Yan Yu: Department of Respiratory and Critical Care Medicine, Peking University People's Hospital, Beijing 100044, China; Beijing Key Laboratory of Genome and Precision Medicine Technologies, Beijing 100101, China.
  40. Shengnan Li: Department of Respiratory and Critical Care Medicine, Peking University People's Hospital, Beijing 100044, China; Beijing Key Laboratory of Genome and Precision Medicine Technologies, Beijing 100101, China.
  41. Yu Kang: CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100490, China. Electronic address: kangy@big.ac.cn.
  42. Zhancheng Gao: Department of Respiratory and Critical Care Medicine, Peking University People's Hospital, Beijing 100044, China; Institute of Chest and Lung Diseases, Shanxi Medical University, Taiyuan 030001, China; Beijing Key Laboratory of Genome and Precision Medicine Technologies, Beijing 100101, China. Electronic address: zcgao@bjmu.edu.cn.

Abstract

Since its initial release in 2001, the human reference genome has undergone continuous improvement in quality. The recently released telomere-to-telomere (T2T) version, T2T-CHM13, reaches the highest level of continuity and accuracy after 20 years of effort, achieved by working on a simplified, nearly homozygous genome of a hydatidiform mole cell line. Here, to provide an authentic, complete diploid human genome reference for the Han Chinese, the largest population in the world, we assembled the genome of a male Han Chinese individual, T2T-YAO, comprising T2T assemblies of all chromosomes in both haplotypes (22 + X + M and 22 + Y). The quality of T2T-YAO is much better than that of all currently available diploid assemblies, and its haploid version, T2T-YAO-hp, generated by selecting the better assembly for each autosome, reaches a quality of fewer than one error per 29.5 Mb, exceeding even that of T2T-CHM13. Derived from an individual living in the aboriginal region of the Han population, T2T-YAO shows clear ancestry and potential genetic continuity from ancient ancestors. Compared with CHM13, each haplotype of T2T-YAO possesses ∼330 Mb of exclusive sequence, ∼3100 unique genes, and tens of thousands of nucleotide and structural variations, highlighting the necessity of a population-stratified reference genome. The construction of T2T-YAO, an accurate and authentic representative of the Chinese population, should enable precise delineation of genomic variations and advance our understanding of the heritability of diseases and phenotypes, especially in the context of the unique variations of the Chinese population.
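The error rate quoted in the abstract can be read on the Phred scale commonly used for assembly consensus quality. The sketch below is illustrative only: it assumes the standard Phred convention, QV = -10 log10(error probability), and the function name `phred_qv` is hypothetical rather than taken from the paper.

```python
import math


def phred_qv(bases_per_error: float) -> float:
    """Phred-scaled quality for an assembly with one error per `bases_per_error` bases.

    Assumes the standard convention QV = -10 * log10(p_error),
    where p_error = 1 / bases_per_error.
    """
    if bases_per_error <= 0:
        raise ValueError("bases_per_error must be positive")
    return 10 * math.log10(bases_per_error)


# One error per 29.5 Mb, the figure quoted in the abstract for T2T-YAO-hp.
qv = phred_qv(29.5e6)
print(f"QV ~ {qv:.1f}")  # prints "QV ~ 74.7"
```

Under this convention, one error per 29.5 Mb corresponds to a QV of roughly 74.7; each additional 10 QV points means a tenfold lower error rate.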

MeSH Terms

Humans
Male
Asian People
China
Diploidy
East Asian People
Genome, Human
Haplotypes
Telomere
