Biological research has entered the era of big data,
including a wide variety of omics data and a broad range of health data. These data are
generated at ever-growing rates and distributed throughout the world under heterogeneous standards and
varying, often limited, access arrangements. However, the promise of translating big data into big knowledge
can be realized only if the data are publicly shared. Providing open access to omics and health big data
is therefore essential for the rapid translation of big data into big knowledge, and is becoming increasingly vital
to advancing scientific research, human healthcare, and precision medicine.
Tencent Meeting ID: 980323108
Tencent Meeting link: https://meeting.tencent.com/s/bFmF92QlDLpu
Tencent live stream: https://meeting.tencent.com/l/Tcz5EefvQ8lD
It is our great pleasure to announce that the 2020 Big Data
Forum for Life and Health Sciences will be held on October 15, 2020. Several renowned biomedical data
scientists have agreed to give talks. You are likewise cordially invited to share your work and
participate in this exciting event.
Thursday, October 15 |
08:50 - 09:00 |
Welcome and Opening Remarks |
09:00 - 10:30 |
Session 1: Omics Data & Human Diseases, chaired by Zhang Zhang |
09:00 - 09:30 |
Keynote talk 1: Metabolic Reprogramming in Cancer:
the bridge that connects intracellular stress and cancer behaviors
Ying Xu, University of Georgia, USA
[Abstract]
Cancer has long been considered a genomic disease, a view that has served as the guiding principle
in cancer research and as the basis for cancer diagnosis and treatment. However, a growing number
of researchers have challenged this viewpoint in the past decade, since it leaves many
cancer-related questions unanswered. We have been developing an evolutionary theory of cancer
over the past few years. The key idea is that persistent inflammation of certain types will lead
to increased local H2O2 and iron concentrations, which together will give rise to Fenton
reaction: Fe2+ + H2O2 -> Fe3+ + ∙OH + OH-. If the environment is also rich in superoxide (O2∙-),
which is predominantly released from neutrophils in cancer tissues, O2∙- can reduce
Fe3+ back to Fe2+, hence driving the reaction to go on as long as O2∙- is available.
We have discovered that (1) all cancer tissues in TCGA have persistent Fenton reactions
in their cytosol and mitochondria, and (2) the rates of cytosolic Fenton reactions will
saturate the pH buffer quickly, hence driving the cytosolic pH up if not neutralized.
Our next key finding is that the affected cells utilize a wide range of metabolic
reprogramming (MR) to produce more protons to keep the Fenton reaction-produced OH-
neutralized. We have studied some 50 MRs in 14 cancer types, each of which produces more
protons than the original metabolism. Further analyses suggest that the affected
cells use cell division as a way to get rid of the persistently produced nucleotides. I will
explain how other clinical behaviors of cancer may be driven by other reprogrammed metabolisms,
mainly to remove their end- or intermediate products so the proton-producing MRs can continue
and keep the affected cell alive.
|
09:30 - 09:50 |
Mental disorder study in the big data era
Jing Wang, Institute of Psychology, Chinese Academy of Sciences
[Abstract]
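The 21st century is the era of big data. With the rapid development of technologies such as
high-throughput sequencing, biological and medical big data are accumulating rapidly, and how to
extract useful information and knowledge from these data is a topic of shared concern among
researchers, including clinical experts. Taking several common, high-incidence mental disorders as
our research subjects, we have established a distinctive strategy for integrating and mining
mental disorder data: we integrate disease omics data from different angles and at different
levels and, on that basis, mine the data through bioinformatics analysis in order to reveal
candidate molecular markers and possible mechanisms of disease. We have also developed a series of
tools, including a pathway analysis tool for genome-wide association study (GWAS) data and a tool
for analyzing the regulatory function of genetic data, to study candidate causal loci and possible
mechanisms from different dimensions. Together, these databases and tools have been accessed by
more than 250,000 users, with more than 50 million cumulative page views. Looking ahead, making
full use of the opportunities and challenges that big data brings, adopting an "integrated"
strategy for disease research, and carrying out multi-dimensional data integration and mining will
help us better reveal disease mechanisms and realize the transition from the traditional medical
model to precision medicine.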
|
09:50 - 10:10 |
Computational biological hypothesis generation using omics data
Peng Yu, West China Hospital of Sichuan University
|
10:10 - 10:30 |
Mutations in the RNA Splicing Factor SF3B1 Promote Tumorigenesis
Zhaoqi Liu, Beijing Institute of Genomics, CAS / China National Center for Bioinformation (CNCB)
|
10:30 - 12:10 |
Session 2: Data Integration & Deep Mining, chaired by Jingfa Xiao |
10:30 - 11:00 |
Keynote talk 2: Multi-Omics Integration for Cancer Related Pattern Discovery
Lin Gao, Xidian University
[Abstract]
The mechanism, diagnosis, and prognosis of cancer are among the core research problems in the life
sciences and related multidisciplinary fields. The challenge is that cancer progression
is a high-dimensional, time-varying, dynamic system. How do we discover
cancer-causing patterns and cancer subtypes, and ultimately associate these patterns with cancer
initiation, progression, and therapy? With increasing amounts of multi-omics and single-cell
multi-omics data becoming available, we can build computational models of these data
using different kinds of models. Systems biology and complex networks provide new insight
into cancer. In this talk, I will investigate network models for different cancer patterns
with multi-omics data integration, and discuss the key methodological challenges faced in computational
disease modeling.
|
11:00 - 11:20 |
Methods to characterize chromatin domains using ultra-low resolution Hi-C data
Zhihua Zhang, Beijing Institute of Genomics, CAS / China National Center for Bioinformation (CNCB)
|
11:20 - 11:40 |
Repeat-derived RNAs help maintain heterochromatin
Shunmin He, Institute of Biophysics, Chinese Academy of Sciences
[Abstract]
Retrotransposons populate vertebrate genomes
and, when active, are thought to cause genome instability with potential benefit to genome
evolution. Retrotransposon-derived RNAs are also known to give rise to small endo-siRNAs to
help maintain heterochromatin at their sites of transcription; however, as not all heterochromatic
regions are equally active in transcription, it remains unclear how heterochromatin is maintained
across the genome. Here, we address these problems by defining the origins of repeat-derived
RNAs and their specific chromatin registers in Drosophila S2 cells. We demonstrate that repeat
RNAs are predominantly derived from active gypsy elements and processed by Dcr-2 into small
RNAs to help maintain pericentromeric heterochromatin. We also show in cultured S2 cells that
synthetic repeat-derived endo-siRNA mimics are sufficient to rescue Dcr-2 deficiency-induced
defects in heterochromatin formation in interphase and chromosome segregation during mitosis,
demonstrating that active retrotransposons are required for stable genetic inheritance.
|
11:40 - 12:10 |
Keynote talk 3: Pathway-guided analysis of alternative splicing during cancer progression
Yi Xing, The Children’s Hospital of Philadelphia & University of Pennsylvania, USA
[Abstract]
Aberrant pre-mRNA alternative splicing (AS) is
widespread in cancer, but the causes and consequences of AS dysregulation during cancer
progression are not well understood. We developed a novel computational framework, PEGASAS,
as a pathway-guided approach for examining the effects of oncogenic signaling on exon
incorporation. PEGASAS was designed to study the interplay among oncogenic signaling, AS, and
affected biological processes. In this study, we applied PEGASAS to define the AS landscape
across prostate cancer disease states and the relationship between splicing and known driver
alterations. We compiled a meta-dataset of RNA-seq data of 876 tissue samples from publicly
available sources, covering a range of disease states, from normal tissues to aggressive
metastatic tumors. PEGASAS analysis revealed a correlation between Myc signaling and splicing
changes in RNA binding proteins (RBPs), suggestive of a previously undescribed auto-regulatory
phenomenon. We experimentally verified this result in a human prostate cell transformation
assay. Our findings establish a role for Myc in regulating RNA processing by controlling
incorporation of nonsense mediated decay determinant exons in RBP-encoding genes. In conclusion,
PEGASAS can mine large-scale transcriptomic data to connect changes in pre-mRNA AS with oncogenic
alterations that are common to many cancer types.
|
12:10 - 13:30 |
Break |
13:30 - 15:20 |
Session 3: Human Population Genomics & Public Health, chaired by Yiming Bao |
13:30 - 14:00 |
Keynote talk 4: Exploring the prehistoric evolution of East Asian populations through ancient genomes
Qiaomei Fu, Institute of Vertebrate Paleontology and Paleoanthropology, Chinese Academy of Sciences
[Abstract]
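The emergence of ancient human genome data from different times and places has made it possible to
study human genetic evolution within a broad spatiotemporal framework. In particular, the
application and development of ancient-genome capture techniques have opened the way to obtaining
genomes from prehistoric populations in southern East Asia, bringing new insights into the study of
early southern East Asian populations. This talk will focus on the evolutionary history of
prehistoric East Asian populations as revealed by ancient genomes. Drawing on studies that range
from the genome of the Paleolithic early modern human "Tianyuan Man" to the genomes of northern and
southern East Asian populations since the Neolithic, it will clarify the genetic characteristics of
different prehistoric modern human populations in East Asia and their genetic connections with
other Eurasian populations, reveal the genetic differences between northern and southern East Asian
populations since the Neolithic together with their migration and admixture, and pin down the
origin of Austronesian speakers. These studies reflect the diversity of prehistoric East Asian
populations and the complexity of their genetic history, and highlight the important role that
population migration and gene flow have played in the formation and development of the present-day
population structure of East Asia.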
|
14:00 - 14:20 |
Admixture history of Singapore Peranakan Chinese revealed by whole genome sequencing analysis
Chaolong Wang, Huazhong University of Science and Technology
[Abstract]
Peranakan Chinese, who are descendants of Chinese
immigrants who settled in the Malay Archipelago ~300-500 years ago, have developed a unique
culture that preserves Chinese traditions with a strong influence from local Malays. Yet,
whether genetic admixture co-occurred with the cultural mixture has been an ongoing debate
historically. We performed whole genome sequencing (WGS) on 177 Singapore Peranakans and 28
indigenous Malays from Indonesia, and analyzed them jointly with WGS data from the SG10K Project
and the 1000 Genomes Project. We estimated that Peranakan Chinese inherited ~5.62% (95%
confidence interval [CI]: 4.75-6.46%) Malay ancestry, much higher than that in the general
Singapore (SG) Chinese (1.08%, 0.69-1.53%), southern Chinese (0.86%, 0.57-1.31%), and northern
Chinese (0.25%, 0.18-0.33%). A sex-biased admixture history, in which the Malay ancestry was
contributed primarily by females, was supported by analyses of the X chromosome, and
mitochondrial and Y haplogroups. Finally, we identified an ancient admixture event shared by
Peranakan Chinese and SG Chinese at ~1,612 (95% CI: 1,345-1,923) years ago, coinciding with
the settlement history of Han Chinese in southern China, and a recent admixture event unique
to Peranakan Chinese at ~190 (159-213) years ago. Our results support the hypothesis that
genetic admixture co-occurred with cultural mixture in forming the Peranakan Chinese community
and uncovered historical admixture events in southern Chinese.
|
14:20 - 14:40 |
Integration of SARS-CoV-2 genome information resources and sequence variation analysis
Shuhui Song, Beijing Institute of Genomics, CAS / China National Center for Bioinformation (CNCB)
|
14:40 - 15:00 |
The microbiota of respiratory tract: progress, challenge, and perspective
Mingkun Li, Beijing Institute of Genomics, CAS / China National Center for Bioinformation (CNCB)
[Abstract]
Investigation of the respiratory tract microbiota is a relatively young field; however, there
has been remarkable progress in understanding the composition and function of the respiratory
tract microbiota in the past few years. Alterations of the respiratory tract microbiota have
been observed in many respiratory diseases, including chronic obstructive pulmonary disease
(COPD), asthma, and cystic fibrosis, but underlying mechanisms and interactions with host genes
are largely unknown. Meanwhile, technologies developed for respiratory tract microbiota have
the potential to identify the pathogen that causes an infection in the respiratory tract.
For instance, SARS-CoV-2 was first identified in the metatranscriptome data of the bronchoalveolar
lavage fluid (BALF).
Our lab has conducted respiratory tract microbiota analysis on over 2000 samples, including
oropharyngeal swabs, sputum, and BALF collected from patients with pneumonia, COPD, or COVID-19
and from healthy controls, to disentangle the association between the respiratory tract
microbiota and disease progression. We are also developing and optimizing
methods and protocols for processing different types of specimens, as well
as new algorithms for analyzing the data.
|
15:00 - 15:20 |
Roles of gut microbes in the pathogenesis and development of obesity, NAFLD, and diabetes: a systems perspective
Hao Wu, University of Gothenburg Sahlgrenska Hospital, Sweden
[Abstract]
The human gut microbiota encompasses a densely populated ecosystem that provides essential
functions for host development, immune maturation, and metabolism. Alterations to the gut
microbiota have been observed in numerous diseases, including human metabolic diseases such
as obesity, non-alcoholic fatty liver disease (NAFLD) and type 2 diabetes (T2D). However,
few studies have validated causality in humans and the underlying mechanisms remain largely
to be elucidated. We discuss how systems biology approaches combined with new experimental
technologies may disentangle some of the mechanistic details in the complex interactions of
diet, microbiota, and host metabolism and may provide testable hypotheses for advancing our
current understanding of human–microbiota interaction.
|
15:20 - 17:20 |
Session 4: Big Data Resources & Clinical Bioinformatics, chaired by Shuhui Song |
15:20 - 15:50 |
Keynote talk 5: Data resources of the China National Center for Bioinformation
Yiming Bao, Beijing Institute of Genomics, CAS / China National Center for Bioinformation (CNCB)
[Abstract]
Genome data are increasing dramatically as the result
of new technologies. Often, these data are required to be deposited into international databases
such as DDBJ, EBI and NCBI, in order to obtain the accession numbers needed for publication. This
can be challenging for researchers in China because of large data sizes, slow data
transfer due to limited international internet bandwidth, language barriers, and technical
issues in communication. To alleviate these problems, the BIG Data Center (BIGD, https://bigd.big.ac.cn)
was launched in 2016 at the Beijing Institute of Genomics (BIG), Chinese Academy of Sciences (CAS).
Over the past few years, BIGD has grown and expanded considerably and has become one of the major
global centers. In 2019, the National Genomics Data Center (NGDC) was created based on BIGD.
Later in the same year, BIG was given the title of China National Center for Bioinformation (CNCB).
CNCB will be built on the well-established NGDC multi-omics databases such as Genome Sequence
Archive (GSA), Genome Variation Map (GVM), Genome Warehouse (GWH) and 2019 Novel Coronavirus
Resource (2019nCoVR), together with specialized resources from many institutions under CAS and
other ministries. CNCB is dedicated to providing freely accessible data repositories and a
variety of data resources in support of worldwide research activities.
|
15:50 - 16:10 |
Proteome-scale analysis of phase-separated proteins in immunofluorescence images
Tingting Li, Peking University School of Basic Medical Sciences
|
16:10 - 16:30 |
iCTCF: an integrative resource of chest computed tomography images and clinical features of patients with COVID-19 pneumonia
Yu Xue, Huazhong University of Science and Technology
[Abstract]
The outbreak of coronavirus disease 2019 (COVID-19), caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), was first reported in Wuhan, China, in December 2019. Here, we report a timely and comprehensive resource named integrative computed tomography (CT) images and clinical features (CFs) for COVID-19 (iCTCF), which archives chest CT images, 130 types of CFs, and laboratory-confirmed SARS-CoV-2 clinical status from 1521 patients with or without COVID-19 pneumonia, reaching a data volume of 265.1 GB. To facilitate COVID-19 diagnosis, we integrate the heterogeneous CT and CF datasets and develop an engineering framework, Hybrid-learning for UnbiaSed predicTion of COVID-19 patients (HUST-19), to predict morbidity and mortality outcomes. We find that integrating the CT and CF datasets achieves striking accuracy, with area under the curve (AUC) values of 0.921, 0.931 and 0.856 for predicting mild/regular, severe/critically ill, and deceased cases, respectively, much higher than when using either CT or CF data alone. Together with HUST-19, iCTCF can serve as a fundamental resource for improving the diagnosis and management of COVID-19 patients.
|
16:30 - 16:50 |
Introduction to the ENA database
Xin Liu, European Bioinformatics Institute, UK
[Abstract]
ENA data structure, challenges of data storage and retrieval, the overall database framework, and the data extraction workflow
|
16:50 - 17:20 |
Keynote talk 6: From promiscuous nucleotide modification to precise genome editing
Li Yang, CAS-MPG Partner Institute for Computational Biology, Shanghai Institute of Nutrition and Health, Chinese Academy of Sciences
[Abstract]
A series of deaminase enzymes, including both cytidine
(a.k.a. APOBECs/AIDs) and adenosine (a.k.a. ADARs) deaminases, catalyze cytidine-to-uridine
(C-to-U) or adenosine-to-inosine (A-to-I) base modification in RNA. Interestingly, cytidine
deaminases can also deaminate cytidines in single-stranded regions of genomic DNA, resulting
in C-to-U base substitutions and eventually C-to-T mutations in the genome. Thousands of promiscuous
C-to-T mutations in cancer genomes have been suggested to be associated with cytidine deaminases.
Strikingly, combining deaminase enzymes with the CRISPR/Cas9 protein achieves targeted
base editing at single-nucleotide resolution in the genome, an approach referred to as the base editor (BE) system.
Recently, we and others have developed a series of novel BEs, including but not limited to
dCpf1-BEs that conjugate catalytically dead Cpf1 with APOBEC and hA3A-BEs that conjugate human
APOBEC3A with nCas9. These newly developed BEs not only expand editing scopes, but also shed
new light on their potential applications in biomedical research, biotechnology and therapeutics
with high precision. Here, I will summarize recent progress in genome editing systems from
the perspective of single-base resolution, highlighting their advances and discussing the distinct mechanisms
of off-target effects for future improvement.
|