Conference - 2023 Big Data Forum for Life and Health Sciences

October 18
09:00 - 09:10	Welcome and Opening Remarks, chaired by Zhang Zhang
Session 1, chaired by Zhang Zhang
09:10 - 10:10	What have we been thinking of 20 years after HGP? Jun Yu, Beijing Institute of Genomics, CAS / China National Center for Bioinformation
10:10 - 10:20	Group photo
Session 2, chaired by Lina Ma
10:30 - 11:00	Towards a sustainable global biodata infrastructure [Abstract] Life science and biomedical research have an absolute dependency on data; little advance is made without the use of existing data or the generation of new data, and researchers frequently deploy a combination of the two. Biodata resources - the databases and services that allow scientists to access, share, curate, integrate and interpret biological data - exist as an ecosystem of more than 3,000 distinct but connected elements of a critical open data infrastructure in use continuously by scientists around the world. Unlike scientific infrastructures outside the life and biomedical sciences, though, these elements - the individual data resources - have emerged from the scientific community with no global coordination. While this brings worthwhile advantages, such as users being closely connected to the design of data services, the lack of coordination makes individual data resources very fragile and poses a very real threat to their sustainability. The Global Biodata Coalition, a coalition of life science and biomedical research funding organisations, came together to address issues of sustainability in the biodata resource infrastructure. An active programme of work includes exploration of the infrastructure landscape, selection processes for data resources with particularly broad global importance, and expert discussions towards mechanisms that better support global cooperation between life science funding organisations. In the presentation I will outline the nature of biodata resources and the infrastructure that they make up, present the sustainability challenges, and introduce the work of the GBC, including its exploration of the global biodata resource landscape and its Global Core Biodata Resource selection programme. Finally, I will introduce the recently presented papers and ongoing consultation on open data strategies and the sustainability of biodata resources. Guy Cochrane, EBI
11:00 - 11:30	Advancements in Nucleotide Sequencing and Genome Analysis: Updates from the DNA Data Bank of Japan [Abstract] The DNA Data Bank of Japan (DDBJ) is a public database of nucleotide sequences established at the Bioinformation and DDBJ Center of the National Institute of Genetics. Since 1987, DDBJ has been accepting annotated nucleotide sequences, issuing accession numbers, and distributing them in collaboration with the European Nucleotide Archive (ENA) at the European Bioinformatics Institute (EBI) and GenBank at the National Center for Biotechnology Information (NCBI). This collaborative framework is known as the International Nucleotide Sequence Database Collaboration (INSDC). In this presentation, I will report updates to the databases and services of the DDBJ Center and also provide updates on my laboratory's activities in analyzing genomes for various species. Yasukazu Nakamura, DDBJ
11:30 - 12:00	Database Resources of CNCB-NGDC Yiming Bao, Beijing Institute of Genomics, CAS / China National Center for Bioinformation
Session 3, chaired by Yiming Bao
13:30 - 14:00	Metagenomics: a tool to unveil new genomes in different environments [Abstract] Metagenomics has proven to be a powerful tool for identifying new organisms in different areas of biology and medicine. We will present results obtained in recent years. Ana Tereza Ribeiro de Vasconcelos, National Laboratory of Scientific Computation Bioinformatics Laboratory
14:00 - 14:30	IMGT®, the international ImMunoGeneTics information system®: current endeavors and future perspectives [Abstract] IMGT®, known as the international ImMunoGeneTics information system®, has been a pioneering force in the fields of immunogenetics and immunoinformatics for over three decades. With a wealth of experience, IMGT® offers a comprehensive array of databases and tools to the scientific community, all centered around the adaptive immune response and built upon the IMGT-ONTOLOGY. Our primary focus is on the latest advancements within the IMGT® databases, tools, reference directories, and web resources, which revolve around three key axes of IMGT® research and development. Axis I centers on unraveling the mysteries of the adaptive immune response by identifying and characterizing the genes responsible for immunoglobulins (IG) and T cell receptors (TR) in jawed vertebrates. This foundational axis serves as the cornerstone for the other two axes. Axis II delves into the analysis and exploration of expressed IG and TR repertoires, drawing upon comparisons with IMGT reference directories in both normal and pathological contexts. Axis III concentrates on scrutinizing amino acid modifications and the functionalities of 2D and 3D structures in antibody and TR engineering. In essence, IMGT® is dedicated to advancing our understanding of the adaptive immune response and providing invaluable resources and tools to further research and development in these critical fields. Sofia Kossida, University of Montpellier
14:30 - 14:50	Decline of gene function discovery in the midst of a biomedical publication tsunami [Abstract] Public repositories such as PubMed meticulously register newly incoming life science papers. The recently observed scale of publishing is unprecedented in human history. Whereas ~8,000 papers were added per annum until 2000, the annual number of new publications roughly tripled during the following two decades (~24,000 papers per year), only to grow >25-fold (>200,000 papers per year) in the most recent years. Thus, drastically increased human and material resources are being fed into the life science research sphere in many countries. The first full genome sequencing of model organisms during 1995-2001 was expected to boost the discovery of gene functions and molecular mechanisms. Indeed, mechanistic insight is the precondition for rational medical intervention and efficient biotech applications. We have examined historical trends for E. coli, yeast and human. Although numerical details differ, the literature for all three organisms shows the same trends: compared with other groups of genes, literature growth rates were highest for uncharacterized or understudied genes until the late 1990s. These growth rates then deteriorated and became negative. At the same time, the literature for already well-studied genes keeps growing steadily and has, in some cases, reached extreme levels. Did the early full genome sequencing of yeast boost gene function discovery? The data show that the moment of publishing the full genome in fact coincides with the onset of the decline of gene function discovery for previously uncharacterized genes (and with a global decrease of the return on investment in life science research).
The points of sudden explosive literature growth about a given gene allow us to estimate the minimal body of publications (and the corresponding amount of research investment) that needs to accumulate to enable further low-risk, more incremental research involving this gene. We find that the size of this body depends on the complexity of the model organism. The number of years needed to accumulate this literature is only weakly dependent on technology development; thus, it mainly requires a great deal of human creativity. As neither a young team leader's contract time nor the typical research grant size matches the needs of function discovery for uncharacterized genes, it is no wonder that scientists shy away from these challenges. There is no evidence that publishing early reports about gene function discovery goes hand in hand with high journal impact factors; rather, the data support the opposite. Nevertheless, we found that a small group of scientific journals contributed extraordinarily to the literature relevant to gene function discoveries. Frank Eisenhaber, A*STAR
14:50 - 15:10	Predictive AI/ML: Shaping the Future of Antimicrobial Vaccines Amjad Ali, National University of Sciences and Technology
Session 4, chaired by Shuhui Song
15:20 - 15:50	The effects of chromatin organization on the inviability of interspecific Xenopus hybrids Xuemei Lu, Kunming Institute of Zoology, CAS
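15:50 - 16:10	Epigenetic regulation of cell identity gene expression [Abstract] A cell type is determined by the network function of its cell identity genes, including master transcription factors that govern the expression status of this network. Cell identity transition is fundamental in normal differentiation and development, whereas a cell that loses its normal identity may cause disease, including cancers. Targeting the driver genes for an abnormal cell identity holds great promise for new therapy. However, our understanding of cell identity regulators is incomplete. Integrating more than 10,000 genomic and epigenomic profiles, we uncovered that cell identity genes, as a unique group, are distinct from other genes in the mechanisms that regulate their expression. These discoveries laid the foundation for us to develop novel machine learning techniques, which utilize expression regulation mechanisms for systematic identification of driver genes for normal cell differentiation and tumor development. These driver genes will lead to new therapeutic targets and diagnostic markers, as successfully verified in cell, xenograft, and PDX models of cancer, and thus will benefit numerous patients. Kaifu Chen, Harvard Medical School/Boston Children's Hospital [Bio] Dr. Kaifu Chen works in computational biology, focusing on computational modeling of the regulatory processes underlying cell differentiation and disease. Dr. Chen is currently Associate Professor in the Department of Pediatrics at Harvard Medical School and Director of Computational Biology in the Department of Cardiology at Boston Children's Hospital. After a bachelor's degree at Nankai University, Dr. Chen earned a PhD at the Beijing Institute of Genomics under Professor Jun Yu and then carried out postdoctoral research under Professor Wei Li at the Duncan Cancer Center, Baylor College of Medicine. Previous positions include Assistant Professor and Associate Professor at the medical school of Cornell University, and founder of the Center for Bioinformatics and Computational Biology at the Houston Methodist Research Institute. Dr. Chen has published more than 80 papers in journals including Nature, Science, and Cell, and has developed more than 20 computational biology algorithms and software packages. Several of Dr. Chen's trainees have gone on to faculty positions at institutions including Harvard University, Peking University, and Cornell University.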
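16:10 - 16:30	AI-accelerated drug development in gene therapy Lijia Ma, Westlake University [Bio] Dr. Lijia Ma received her PhD in bioinformatics in 2009. She joined the University of Chicago as a postdoctoral scholar in 2010 and was appointed research and development scientist in 2014. Dr. Ma was closely involved in several large NIH genomics projects, including ENCODE and modENCODE. In 2018 she joined Westlake University, where she established the Laboratory of Functional Genomics and Systems Biology, which focuses on developing high-throughput omics tools to decode and edit the human genome. She is also the scientific founder and a board member of 西湖云谷智药, a biotechnology startup focused on developing AI-accelerated platform technologies for gene-editing therapies and on developing advanced gene therapy products.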
16:30 - 18:00	NGDC Strategy Discussion (Room 1103, 11th floor) Institute leadership and invited experts

October 19
Session 1, chaired by Jingfa Xiao
09:00 - 09:30	A Database of Prokaryotic Global Regulators Based on Machine Learning Songnian Hu, Institute of Microbiology, CAS
09:30 - 10:00	Transcription factors binding syntax in mammalian genomes [Abstract] Regulatory elements in mammalian genomes activate promoters by recruiting transcription factors (TFs) to DNA motifs. Although motif arrangement is critical to their function, the syntax behind the cis-regulatory code remains undefined. We here explore this question in 400 mouse and human primary cell types by combining an atlas of TF motifs with transcriptome, accessibility, and footprinting maps. We find that TFs are either recruited to low-density elements with few partners or occupy overcrowded sequences. Furthermore, TF motifs are either centrally positioned or depleted from enhancer mid-points, a feature that is evolutionarily conserved and reflects known TF activities. In addition, based on a pairwise TF colocalization map, we uncover two TF groups that colocalize with most expressed factors, forming stripes in hierarchical clustering maps. The first group consists of lineage-determining factors that occupy DNA elements broadly, consistent with their key role in tissue-specific transcription. The second group, dubbed universal stripe factors (USFs), comprises ~30 SP, KLF, EGR, and ZBTB family members that recognize overlapping GC-rich sequences in all tissues analyzed. Knockouts and single-molecule tracking reveal that USFs impart accessibility to colocalized partners and increase their residence time. Mammalian cells have thus evolved a superfamily of TFs with overlapping binding that facilitates promoter-enhancer accessibility. Yongbing Zhao, Guangzhou Institutes of Biomedicine and Health, CAS [Bio] Dr. Yongbing Zhao holds a bachelor's degree from Huazhong University of Science and Technology and a PhD from the Beijing Institute of Genomics, Chinese Academy of Sciences, and is currently a research professor at the Guangzhou Institutes of Biomedicine and Health, Chinese Academy of Sciences. Dr. Zhao uses deep learning, bioinformatics, and molecular biology techniques to study the rules and mechanisms of gene transcriptional regulation during cell fate determination, and has published 11 papers as first or corresponding author (including shared authorship) in journals including Molecular Cell, Science Advances, and Nucleic Acids Research.
10:00 - 10:30	Buffalo genome and functional gene investigation Qingyou Liu, Foshan University
Session 2, chaired by Songnian Hu
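10:40 - 11:10	Role of RNA m6A methyltransferase METTL3 in promoting medulloblastoma progression [Abstract] Medulloblastoma (MB) is the most common malignant intracranial tumor in children, and developing effective targeted therapies requires a thorough understanding of its pathogenesis. RNA m6A methylation (m6A) plays an important regulatory role in RNA metabolism, and disruption of its balance can lead to tumors and other diseases. SHH-subtype medulloblastoma (SHH MB) arises from the over-proliferation of cerebellar granule neurons, and m6A regulates granule neuron proliferation during cerebellar development; however, whether m6A plays a role in the initiation and progression of medulloblastoma has not yet been reported. We found that the m6A methyltransferase METTL3 is abnormally upregulated at both the RNA and protein levels, and that its expression level is negatively correlated with patient prognosis. m6A-seq analysis of SHH MB tumor samples showed that far more genes are m6A-hypermethylated in tumors than in control samples. At the cellular level, we then showed that METTL3 regulates cell proliferation and migration. Combining sequencing with experimental validation, we identified PTCH1 and GLI2 RNAs as the main targets of METTL3 and found that METTL3 regulates, through m6A methylation, the degradation rates of both RNAs and the translation efficiency of GLI2, thereby modulating the activity of the SHH pathway. Orthotopic cerebellar tumor experiments in nude mice and small-molecule inhibitor experiments further demonstrated that inhibiting METTL3 expression slows the progression of SHH-subtype tumors. In summary, this study reveals an epitranscriptomic regulatory mechanism underlying SHH MB development and provides new ideas and a theoretical basis for developing therapeutic targets for SHH MB. Yamei Niu, Chinese Academy of Medical Sciences [Bio] Research professor and doctoral supervisor at the Institute of Basic Medical Sciences, Chinese Academy of Medical Sciences, and Secretary-General of the Xiehe Federation of Returned Overseas Chinese. Dr. Niu holds a PhD in pharmaceutical sciences from Nagasaki University, Japan, and did postdoctoral work at the WHO International Agency for Research on Cancer in France and at Kurume University School of Medicine in Japan. Dr. Niu's research addresses brain tumors, neurodegenerative disorders, and related diseases, systematically studying, at the molecular, cellular, and whole-animal levels, the roles and mechanisms of the epitranscriptomic mark RNA m6A modification in these diseases, with the aim of providing a theoretical basis for the early warning, diagnosis, or treatment of major brain diseases. Recent major results have been published in international journals including Cell Reports, Cell Discovery, Genome Biology, Molecular Cell, and Genomics, Proteomics & Bioinformatics.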
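11:10 - 11:40	An evolutionary and genomic approach for dissecting ethnic differences in hepatocellular carcinoma carcinogenesis and treatment [Abstract] Intra-tumor heterogeneity (ITH) underlies tumor evolution and treatment, and understanding its origin and evolution is of great scientific importance. As research has progressed, a new level of heterogeneity has rapidly emerged: differences in cancer between populations (i.e., the ethnic specificity of cancer). In this talk, I will take hepatocellular carcinoma as the subject, explore its ethnic specificity and, building on that specificity, examine how combined radiotherapy and immunotherapy regimens can be used to design precision treatments for different populations. I hope this talk will show how precision treatment strategies for different ethnic groups can be built while dissecting the ethnic specificity of cancer. Weiwei Zhai, Institute of Zoology, CAS [Bio] Weiwei Zhai is a research professor at the Institute of Zoology, Chinese Academy of Sciences, and a recipient of the youth program of the national talent recruitment plan. His main research field is population genetics. Taking cancer cell populations as his main subject and centering on the origin and evolution of tumor heterogeneity, he has made a series of advances on important scientific questions such as the ethnic specificity of cancer and the spatial heterogeneity of tumors. He has published more than 60 articles in journals including Nature Genetics. His awards include the AAAS Newcomb Cleveland Prize (best paper in Science, 2010) and the 2019 Top Ten Advances in Chinese Medical Biotechnology. He currently serves on the editorial boards of Molecular Biology and Evolution, a core journal in the field of evolution, and National Science Review, a Chinese academic journal.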
11:40 - 12:00	Molecular Mechanisms of B-Cell Acute Lymphoblastic Leukemia Relapse Under CAR-T Cell Therapy Fuhong He, Beijing Institute of Genomics, CAS / China National Center for Bioinformation
Session 3, chaired by Xin Sheng
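13:30 - 14:00	Applications of single-cell sequencing in disease and development research [Abstract] With advances in sequencing technology and falling sequencing costs, single-cell sequencing is now widely used in basic and clinical research. Single-cell transcriptome sequencing (scRNA-Seq) in particular provides a high-resolution way to probe the function of every cell in the body and is a powerful tool for studying cellular heterogeneity, cell development and differentiation, and disease onset. Drawing on several projects from our group that center on single-cell transcriptomic analysis, this talk systematically illustrates applications of scRNA-Seq in these fields. In the first project, we induced in vitro expanded human chondrocytes with IL-1β to mimic an inflammatory environment and then performed single-cell sequencing; the analysis showed that expanded human chondrocytes are intrinsically heterogeneous and, under inflammatory stimulation, differentiate along two paths, an inflammatory-response path and a non-inflammatory-response path. In the second project, we reanalyzed published single-cell transcriptomic data on calcific aortic valve disease (CAVD), identified several previously unreported cell subtypes, and found a special class of aortic valve interstitial cells (VICs) that secrete the signaling molecule MDK and thereby inhibit calcification of other VICs, further refining the model of CAVD pathogenesis. In the third project, we performed single-cell transcriptome sequencing and analysis of mouse hearts at different developmental stages and identified a special population of mononuclear cardiomyocytes, offering a new perspective on cardiomyocyte proliferation and differentiation. Together, these projects show that single-cell transcriptomics is applicable to a wide range of research subjects and plays an important role in such studies. Ximiao He, Huazhong University of Science & Technology [Bio] Dr. Ximiao He has long worked on epigenomics, bioinformatics, and precision medicine. Dr. He is currently a professor at Huazhong University of Science and Technology, Chair of the Department of Physiology in the School of Basic Medicine, Tongji Medical College, and Director of the Center for Genomics and Proteomics Research, and was recruited under a high-level overseas talent program. Dr. He graduated from Tongji Medical University, received a PhD in genomics at the Beijing Institute of Genomics under Professor Jun Yu, and then served successively as postdoctoral fellow and Research Fellow at the National Cancer Institute, U.S. National Institutes of Health. Dr. He has published more than 50 papers in journals including Genome Research and NAR; is a member of the International Society for Computational Biology (ISCB) and the Human Genome Organisation (HUGO); is President of the Wuhan Physiological Society and an executive council member of the Hubei Physiological Society; and serves as Associate Editor of Evolutionary Bioinformatics and on the editorial board of Current Medical Science.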
14:00 - 14:30	Multiomics research in Center for Life Sciences, NLA, NU: directions and challenges Ainur Akilzhanova and Ulykbek Kairov, National Laboratory Astana, Nazarbayev University, Kazakhstan
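14:30 - 15:00	CHD6 Promotes Broad Nucleosome Eviction for Transcriptional Activation of oncogenes and cell identity [Abstract] Although CHD6 is a member of the chromodomain helicase DNA-binding protein family, little is known about its exact role in chromatin remodeling or cancer. Here we show that CHD6 binds to chromatin to promote broad nucleosome eviction for transcriptional activation of many cancer pathways. By integrating multiple patient cohorts for bioinformatics analysis of over a thousand prostate cancer datasets, we found that CHD6 expression is elevated in prostate cancer and associated with poor prognosis. Further comprehensive experiments demonstrated that CHD6 regulates the oncogenicity of prostate cancer cells and tumor development in a murine xenograft model. ChIP-Seq for CHD6, along with MNase-Seq and RNA-Seq, revealed that CHD6 binds to chromatin to evict nucleosomes from promoters and gene bodies for transcriptional activation of oncogenic pathways. These results demonstrate a key function of CHD6 in evicting nucleosomes from chromatin for transcriptional activation of prostate cancer pathways. Furthermore, we are studying the mechanism of CHD6 binding to cell identity genes. Dongyu Zhao, Peking University [Bio] Dongyu Zhao is Executive Director of the Department of Medical Bioinformatics, School of Basic Medical Sciences, Peking University, research professor, and doctoral supervisor; PI at the State Key Laboratory of Vascular Homeostasis and Remodeling; a recipient of the national program for high-level overseas outstanding young talent; and a Peking University Boya Young Fellow. Dr. Zhao is a youth council member of the Computational Systems Biology Branch of the Operations Research Society of China, a member of the Tumor Sequencing and Big Data Analysis Committee of the China Anti-Cancer Association, a member of the Epigenetic Informatics Committee and the Systems Biology Committee of the Chinese Society for Bioinformatics, and a member of the Beijing Bioinformatics Research Society. Dr. Zhao received a PhD in 2013 from the Beijing Institute of Genomics, Chinese Academy of Sciences, under Professor Jun Yu, then carried out postdoctoral research in the United States at the Houston Methodist Research Institute and in Professor Kaifu Chen's laboratory at Harvard Medical School, and returned to China to join Peking University in early 2021. Research focus: disease genes. Related work has been published as first or corresponding author in international journals including Molecular Cell, Nature Communications (3 papers), Nucleic Acids Research, Oncogene, Cell Proliferation, and iScience.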
Session 4, chaired by Ximiao He
15:10 - 15:40	Mapping the genetic architecture of human traits to cell types in the kidney identifies mechanisms of disease and potential treatments [Abstract] The functional interpretation of genome-wide association studies (GWAS) is challenging due to the cell-type-dependent influences of genetic variants. Here, we generated comprehensive maps of expression quantitative trait loci (eQTLs) for 659 microdissected human kidney samples and identified cell-type-eQTLs by mapping interactions between cell type abundances and genotypes. By partitioning heritability using stratified linkage disequilibrium score regression to integrate GWAS with single-cell RNA sequencing and single-nucleus assay for transposase-accessible chromatin with high-throughput sequencing data, we prioritized proximal tubules for kidney function and endothelial cells and distal tubule segments for blood pressure pathogenesis. Bayesian colocalization analysis nominated more than 200 genes for kidney function and hypertension. Our study clarifies the mechanism of commonly used antihypertensive and renal-protective drugs and identifies drug repurposing opportunities for kidney disease. Xin Sheng, Zhejiang University [Bio] Xin Sheng is a research professor under Zhejiang University's "Hundred Talents Program" and a doctoral supervisor, working on genetics-based studies of the pathogenesis of complex diseases. Dr. Sheng has published 29 papers in journals including Nature Genetics, with a cumulative impact factor above 460; over the past five years these papers have been cited more than 1,700 times in journals including Cell, Nature, and the New England Journal of Medicine, and 19 of them are highly cited papers (h-index = 19, i10-index = 24). Seven papers were published as first (including co-first) or corresponding author. Dr. Sheng leads projects funded by the Hangzhou overseas high-level talent program and the National Natural Science Foundation of China, and has given multiple invited oral presentations at major international meetings such as the American Society of Nephrology (ASN) annual meeting.
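15:40 - 16:10	Human gut microbiome research over the last decade: current challenges and future directions [Abstract] Despite the rapid advances in gut microbiome research, there remain many challenges that need to be addressed. Here we will discuss eight key challenges that we are currently facing, with the aim of shedding light on future research directions. Hao Wu, Fudan University [Bio] Hao Wu is a young investigator and doctoral supervisor at the Human Phenome Institute and the Fudan Microbiome Center, Fudan University; a jointly appointed expert in the Department of Metabolic and Bariatric Surgery at Huashan Hospital, Fudan University; and a recipient of the Shanghai Pujiang Talent award and the national overseas high-level talent program. Dr. Wu also serves as a committee member of the Gut Microbiota Branch and a youth committee member of the Metabolic Biology Branch of the Biophysical Society; a member of the Metabolism Committee of the Society of Biochemistry and Molecular Biology; a member of the Biome Committee of the Society for Microbiology; and a member of the Gut Microecology and Fecal Microbiota Transplantation Committee of the Association for the Promotion of Human Health Science and Technology. Dr. Wu mainly studies how interactions between gut bacteria and diet affect the development, prevention, and treatment of metabolic diseases such as obesity, non-alcoholic fatty liver disease, and diabetes, and has published multiple papers as first or corresponding author in journals including Nature Medicine, Nature Communications, Cell Metabolism, EMBO Molecular Medicine, and The ISME Journal.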
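16:10 - 16:30	Agile analytics to accelerate innovation in the life sciences Lenovo
16:30 - 16:40	Building research computing infrastructure in the era of large models Inspur
16:40 - 18:00	Industry-academia-research panel discussion (Room 1103, 11th floor) Experts and industry representatives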

October 20
Session 1, chaired by Dake Zhang
09:00 - 09:30	Study of marine aquaculture diseases based on environmental genomics Xumin Wang, Yantai University
09:30 - 10:00	Synthetic biology: green, "intelligent" manufacturing of beauty products and health materials Guiming Liu
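10:00 - 10:30	Genetic study of neurodevelopmental disorders [Abstract] Childhood neurodevelopmental disorders such as attention-deficit/hyperactivity disorder (ADHD) and autism spectrum disorder (ASD) are highly heritable. Using GWAS, WGS, and related methods, we identify genetic loci for these disorders and their associated endophenotypes, and integrate multi-omics data with imaging and cognitive data to analyze the function of these loci and the pathways through which they act. In addition, ADHD and ASD show high comorbidity and, with increasing age, are often accompanied by depression, anxiety, sleep disorders, and substance dependence; we study the comorbidity mechanisms among these conditions from the perspective of shared genetic mechanisms. We also use machine learning and related methods to develop disease prediction and disease classification models that integrate these multidimensional biomarkers, which should aid early diagnosis and precision care. Suhua Chang, Peking University Sixth Hospital [Bio] Associate researcher and doctoral supervisor at Peking University Sixth Hospital, and deputy director of the office of the National Clinical Research Center for Mental Disorders. Main research interests are the genetic mechanisms of neurodevelopmental disorders: integrating genetic, multi-omics, imaging, cognitive, and other biomedical big data to understand how genes act, and building disease risk assessment models based on multidimensional data. Leads projects under the National Key R&D Program and the National Natural Science Foundation of China. Has published more than 50 papers in leading international journals such as Molecular Psychiatry and Biological Psychiatry.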
Session 2, chaired by Suhua Chang
10:40 - 11:00	Characteristics and Origins of Alterations in DNA Methylation Profiles in Diseased Tissues Dake Zhang, Beihang University
11:00 - 11:20	Single-Cell Transcriptomics Reveals Immune Reconstitution in Patients with R/R T-ALL/LBL Treated with Donor-Derived CD7 CAR-T Therapy Wei Chen, Beihang University
11:20 - 11:40	Rare diseases: diagnosis and treatment Qing Zhou, Zhejiang University
11:40 - 12:00	Transcriptomics Study of Cerebral Small Vessel Disease [Abstract] Cerebral small vessel disease (CSVD) is a major contributor to stroke and dementia, and it endangers the health of older individuals. However, its clinical diagnosis relies primarily on radiography. Further, omics research and effective biomarkers for CSVD remain scarce, and its pathogenesis has not been fully elucidated. To facilitate an in-depth understanding of CSVD at the molecular level and to characterize the potential risk factors for this disease, we carry out a set of systematic transcriptome studies based on the peripheral blood of 91 Chinese patients with CSVD. We profile transcriptome heterogeneity between patients with CSVD and healthy individuals and perform quantitative real-time PCR to validate randomly selected differentially expressed genes. We also perform a comparative analysis between dementia and non-dementia subgroups, which involves extensive vascular risk factors, clinical measurements, and neuroimaging features. More importantly, a CSVD-prediction model and a dementia-forecasting model have been constructed using machine learning methods. We construct a comprehensive transcriptome map of CSVD, and identify several significant potential biomarkers of the disease. We also identify several differentially distributed neuroimaging and clinical features between the dementia and non-dementia subgroups, including total white matter hyperintensity severity, CSVD burden, neutrophil counts, and triglyceride levels, among others. The newly constructed CSVD-prediction and dementia-forecasting models achieve the highest F1 scores of 0.84 and 0.66, respectively. Both models are thus expected to effectively support clinical predictions and diagnoses. In summary, this study profiles the transcriptome characteristics of CSVD, which provides a foundation for an informative and practical method to facilitate CSVD-related studies and insights into clinical assessments and decisions. Fengyu Wang, Henan Provincial People's Hospital [Bio] Fengyu Wang, PhD in genetics, is an associate researcher at Henan Provincial People's Hospital (People's Hospital of Zhengzhou University). Committee member of the Vascular Cognitive Impairment Branch of the Chinese Stroke Association and of the Genetic Counseling Branch of the Genetics Society of China, and standing committee member of the Geriatric Disease Prevention Committee of the Henan Preventive Medicine Association. Works mainly on neurogenetics. In recent years has led or participated in 5 national, provincial, and ministerial research projects, published nearly 20 papers in Chinese and international journals, contributed to 2 books, and been granted 2 invention patents.
Session 3, chaired by Lan Jiang
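13:30 - 13:50	Forensic multi-omics study on precise identification of difficult biological evidence [Abstract] Sequencing technology has progressed from first-generation sequencing on polyacrylamide gel and capillary electrophoresis platforms, to today's rapidly developing high-throughput second-generation sequencing, to ultra-long-read single-molecule third-generation sequencing. Detecting DNA and RNA sequences down to the single cell, the smallest unit of life, is increasingly becoming mainstream. These maturing technologies make it possible to discover and validate multi-omics molecular markers suited to precise forensic individual identification. At the same time, as artificial intelligence matures, a wave of "AI + other disciplines" has emerged, which has also substantially changed basic research and practice in forensic science. By integrating forensic multi-omics, microbiomics, bioinformatics, artificial intelligence, and other interdisciplinary technologies to establish multi-omics AI analysis methods and an evidence interpretation framework suited to difficult biological evidence, this work aims to provide a new theoretical basis and technical support for solving the problem of individual identification from difficult forensic biological evidence. Jiangwei Yan, Shanxi Medical University [Bio] Distinguished professor at Shanxi Medical University, chief forensic examiner, and doctoral supervisor; vice chair of the Forensic Biological Evidence Committee of the Chinese Forensic Medicine Association, vice chair of the Forensic Genetics Committee of the Genetics Society of China, and member of the China Criminal Technology Standardization Committee. Works mainly on innovative forensic multi-omics and AI analysis for the precise forensic identification of difficult biological evidence. Has led 12 national research projects, including National Key R&D Program projects, key and general projects of the National Natural Science Foundation of China, and national major special projects; has published more than 100 papers in journals including FORENSIC SCI INT-GEN and GENOMICS PROTEOMICS & BIOINFORMATICS; and holds 25 granted patents. Has received two second-class and two third-class provincial or ministerial awards, the titles of Beijing Outstanding Young Intellectual and Beijing Political and Legal Affairs Commission "Ten-Hundred-Thousand" Talent, and the titles of Shanxi Province Academic and Technical Leader and Outstanding Young and Middle-aged Innovative Talent.
13:50 - 14:10	Functional genomics analysis of wheat based on multi-omics data Wei Tong, Northwest A&F University
14:10 - 14:30	Enhanced understanding on intra-host evolution of pathogens [Abstract] The greatest challenge in pathogen research is determining, within a host, whether a pathogen is new or an evolved form of an existing one. Meeting this challenge fundamentally requires obtaining complete genomic characteristics and carrying out systematic, in-depth studies. With advances in pathogen genomics and bioinformatics algorithms, we can study the heterogeneity of pathogen variation within individual hosts and clarify how pathogen variants arise, accumulate, are selected, and spread within an individual, going from low to high abundance. This should help rebuild the basic theory of pathogen genetic variation, infection, and immunity. Chen Chen, Beijing Shijitan Hospital
14:30 - 14:50	Research progress and prospects of Microbial Environmental Genomics Yingfeng Luo, Institute of Microbiology, CAS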
Session 4, chaired by Wei Tong
15:00 - 15:20	Droplet microfluidics based combinatorial indexing for massive-scale single-cell sequencing [Abstract] Single-cell RNA sequencing methods focusing on the 5′-end of transcripts can reveal promoter and enhancer activity and allow efficient profiling of the immune receptor repertoire. However, ultra-high-throughput 5′-end single-cell RNA sequencing methods have not been described. Here, we introduce five-prime end single-cell combinatorial indexing RNA-Seq (FIPRESCI), enabling massive sample multiplexing and increasing the throughput of the droplet-microfluidics system by over 10-fold. FIPRESCI combines the ability of indexed Tn5 transposome to barcode RNA/cDNA hybrid heteroduplexes in situ with Template Switching Oligo barcoding on a commercial droplet-microfluidics platform. Using FIPRESCI, we profiled the transcriptome and transcribed cis-regulatory elements from various human and mouse cell lines and demonstrated that the approach is compatible with both cells and nuclei. We applied FIPRESCI to E10.5 whole mouse embryo and uncovered many previously unknown isoform switches during GABAergic neurogenesis of important regulators, including Rbfox2. We further applied FIPRESCI to primary T cells from human peripheral blood mononuclear cells (PBMCs) and demonstrated that it enables simultaneous identification of cancer patients' specific subpopulation, gene expression, and T cell receptor (TCR) signatures. Given its simplicity, flexibility, and scalability, FIPRESCI will have wide application in cell atlas studies, large-scale screening, and single-cell immune profiling of large cohort studies. Lan Jiang, Beijing Institute of Genomics, CAS / China National Center for Bioinformation [Bio] Lan Jiang is a principal investigator with long-standing research in epigenetics and single-cell multi-omics. Achievements include developing single-cell omics technologies and computational methods (2023 Genome Biology, 2023 Cell Reports, 2022 Small, 2017 Cell Reports, 2016 Genome Biology), establishing a new direction in histone modification-mediated imprinted genes (2017 Nature), and reporting the patterns of transgenerational inheritance of DNA methylation in early vertebrate embryos (2013 Cell). Has published 20 papers, including 10 as first or corresponding author (co-first and co-corresponding included) in journals such as Cell, Nature, and Genome Biology, plus 1 monograph. Awards include the Chinese Academy of Sciences President's Special Award, the CAS Top 100 Outstanding Doctoral Dissertations award, the Ray Wu Prize, the Charles A. King Trust Fellowship (USA), and the NIH K99 Award. Leads projects under the national overseas high-level talent program, the National Key R&D Program, the CAS Strategic Priority Research Program (Category B), and the National Natural Science Foundation of China.
15:20 - 15:40	The integrative analysis of cancer multi-omics data and database construction Jingyao Zeng, Beijing Institute of Genomics, CAS / China National Center for Bioinformation
15:40 - 17:00	Talent development panel discussion (Room 1103, 11th floor), experts, young staff, graduate student representatives, and others

The 8th Big Data Forum for Life and Health Sciences (October 18-20, 2023)
