MethBank is a comprehensive database of single-base-resolution DNA methylation profiles across a variety
of species. It integrates 1363 whole-genome single-base methylomes for 23 species, covering 208
tissues/cell lines and 15 biological contexts, and provides manually curated knowledge, including
featured differentially methylated genes (DMGs) across 11 kinds of biological contexts (such as disease) and
a collection of methylation tools. MethBank also provides analysis tools, including Age Predictor,
IDMP and DMR toolkit, to support biologists' research.
1. Data Collection
High-quality raw sequencing data of WGBS samples were acquired from accessible data repositories.
The data included in the database were selected according to the following filtering rules:
(1) The status of data resource is open-access
(2) [DataSet Type] = "methylation profiling by high throughput sequencing"
(3) [All Fields] = "WGBS" or "BS-Seq" or "Whole Genome Bisulfite Sequencing"
(4) The predicted sequencing depth of WGBS sample should be greater than 10.
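The four rules above can be read as a single predicate over dataset metadata. The following is a minimal sketch in Python. The record fields (`open_access`, `dataset_type`, `description`, `predicted_depth`) are hypothetical names chosen for illustration and are not MethBank's actual schema.

```python
from dataclasses import dataclass

# Search terms from rule (3), lower-cased for case-insensitive matching.
WGBS_TERMS = ("wgbs", "bs-seq", "whole genome bisulfite sequencing")
# Dataset type required by rule (2).
REQUIRED_TYPE = "methylation profiling by high throughput sequencing"
# Depth threshold from rule (4); the sample must be strictly above it.
MIN_DEPTH = 10.0


@dataclass
class DatasetRecord:
    # Hypothetical metadata fields; a real repository export will differ.
    accession: str
    open_access: bool
    dataset_type: str
    description: str
    predicted_depth: float


def passes_filters(rec: DatasetRecord) -> bool:
    """Return True if the record satisfies rules (1) to (4)."""
    text = rec.description.lower()
    return (
        rec.open_access
        and rec.dataset_type.lower() == REQUIRED_TYPE
        and any(term in text for term in WGBS_TERMS)
        and rec.predicted_depth > MIN_DEPTH
    )
```

A record that fails any one rule is rejected, so a depth of exactly 10 does not pass.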
Knowledge is curated from publications retrieved in PubMed.
For Featured DMGs, the publication search followed these rules:
(1) The keywords match "WGBS", "whole-genome bisulfite sequencing", "RRBS"
and "whole-genome DNA methylation"
(2) Published in the past twelve years (2010-present)
(3) Associated with featured DMGs.
Publications that pass this initial screening are then manually curated.
2. Data Analysis
The WGBS data analysis pipeline includes quality control, trimming of adapters and low-quality bases,
alignment to the reference genome, extraction of CpG methylation, quantification of the average
methylation level of genes and CpG islands, analysis of highly methylated CpG islands (genomic location
enrichment, GO & KEGG enrichment), analysis of genes related to methylated CpG islands, and analysis of
differentially methylated regions (DMRs; genomic location enrichment, GO & KEGG enrichment).
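As a rough illustration of the first pipeline steps, the sketch below builds (but does not run) command lines for quality control, alignment and methylation extraction of single-end reads. It is not MethBank's actual pipeline script: the function name, the paths and the assumed Bismark output file name are illustrative, and the trimming step is left out.

```python
from pathlib import Path
from typing import List


def build_pipeline_commands(fastq: str, genome_dir: str, out_dir: str) -> List[List[str]]:
    """Build commands for single-end reads: FastQC, Bismark alignment, extraction."""
    # Derive a sample name by stripping common FASTQ suffixes.
    sample = Path(fastq).name
    for suffix in (".gz", ".fastq", ".fq"):
        if sample.endswith(suffix):
            sample = sample[: -len(suffix)]
    # Assumed default Bismark output name for single-end Bowtie 2 runs;
    # check the naming used by your Bismark version.
    bam = str(Path(out_dir) / f"{sample}_bismark_bt2.bam")
    return [
        # Quality control report written to out_dir.
        ["fastqc", fastq, "-o", out_dir],
        # Bisulfite-aware alignment against a prepared genome folder.
        ["bismark", "--genome_folder", genome_dir, "-o", out_dir, fastq],
        # Per-cytosine methylation calls plus a bedGraph summary.
        ["bismark_methylation_extractor", "--single-end", "--comprehensive",
         "--bedGraph", "-o", out_dir, bam],
    ]
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` once the tools are installed and the genome folder has been prepared with Bismark's genome preparation step.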
First, all bisulfite sequencing reads were subjected to quality control with FastQC v0.11.7 and trimmed to
remove adapters and low-quality bases using Fastq-dump (sratoolkit.2.8.2-1). Next, the reads that
passed quality control were mapped to the reference genome of the corresponding species using
Bismark-0.22.3. Detailed reference genome information for each species is shown in the table below. We
used the Bismark methylation extractor to extract methylation data from the aligned, filtered reads. To
assess data quality, we computed six indicators: Mapping rate, Unique mapping rate, Genome coverage,
C coverage, Conversion rate and Depth. The corresponding calculations are shown below.
| Species | Genome Version for Mapping | Genome Version for Annotation |
| Canis lupus familiaris | | |
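The six quality indicators named above can be sketched as simple ratios. The definitions below are commonly used ones and are assumptions; they may differ from MethBank's exact calculations, and all field names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class SampleCounts:
    # Hypothetical inputs gathered from alignment and extraction reports.
    total_reads: int
    mapped_reads: int
    uniquely_mapped_reads: int
    genome_size: int             # reference genome length in bases
    covered_bases: int           # reference bases covered by at least one read
    total_cytosines: int         # cytosines in the reference
    covered_cytosines: int       # cytosines covered by at least one read
    mapped_bases: int            # total bases in mapped reads
    unconverted_control_c: int   # methylated calls at sites assumed unmethylated
    total_control_c: int         # all calls at those control sites


def quality_indicators(c: SampleCounts) -> Dict[str, float]:
    """Compute the six indicators under the assumed definitions above."""
    return {
        "mapping_rate": c.mapped_reads / c.total_reads,
        "unique_mapping_rate": c.uniquely_mapped_reads / c.total_reads,
        "genome_coverage": c.covered_bases / c.genome_size,
        "c_coverage": c.covered_cytosines / c.total_cytosines,
        "conversion_rate": 1.0 - c.unconverted_control_c / c.total_control_c,
        "depth": c.mapped_bases / c.genome_size,
    }
```

Under these assumptions, the conversion rate is estimated from cytosines expected to be unmethylated (for example a spike-in control), which is one of several conventions in use.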
Then, bedtools v2.17.0 and python3 were used to analyze gene methylation profiles of promoter, body and
downstream regions in the C/CG/CH contexts. Subsequently, the relationship between CpG islands and genes
was studied. From the perspective of CpG islands, we calculated the DNA methylation level of each
CpG island and selected highly methylated CpG islands (average methylation level >= 0.6) for
downstream analysis, including genomic location enrichment and GO & KEGG enrichment. From the
perspective of genes, we provide all CpG islands that overlap with genes, together with the
corresponding location information.
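The selection of highly methylated CpG islands can be illustrated as follows. This is a minimal sketch, assuming that the average methylation level of an island is the mean of the per-site levels inside it; the tuple layouts and function names are hypothetical, and a pooled read-count average would be an equally plausible definition.

```python
from typing import Dict, List, Tuple

Site = Tuple[str, int, float]        # (chromosome, position, methylation level in 0..1)
Island = Tuple[str, int, int, str]   # (chromosome, start, end, island id), half-open [start, end)


def island_mean_methylation(sites: List[Site], islands: List[Island]) -> Dict[str, float]:
    """Average the per-site methylation levels that fall inside each island."""
    means: Dict[str, float] = {}
    for chrom, start, end, name in islands:
        levels = [lvl for c, pos, lvl in sites if c == chrom and start <= pos < end]
        if levels:  # islands without covered sites are skipped
            means[name] = sum(levels) / len(levels)
    return means


def highly_methylated(means: Dict[str, float], threshold: float = 0.6) -> List[str]:
    """Return ids of islands whose average level is at least the threshold."""
    return sorted(name for name, m in means.items() if m >= threshold)
```

The linear scan over all sites is kept for clarity; on real data an interval tool such as bedtools would typically perform the overlap step.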
Finally, we identified differentially methylated regions (DMRs) using the DSS R package for the typical
biological contexts within a single project, and analyzed genomic location enrichment and GO & KEGG enrichment.
3. Meta Curation
The manual curation of metadata is done on 2 levels ('Project' and
'Sample'). The corresponding review contents and standards are as follows:
1. Home Page
You can fill in the entry of interest in the input box to search and view the
corresponding data and information that you need.
For example, you can search for a list of projects, samples, publications, and featured DMGs related to
human WGBS data by entering "Homo sapiens" and selecting the corresponding item.
You can enter a gene name, such as TP53, to obtain the specific information of the gene in multiple
species and its related publications and featured DMGs at the WGBS level.
You can also directly click the specific areas in the four sections
of "Data Resource", "Knowledge Curations", "Tools" and "Methylation Snapshots" below to jump to the
overall page of the corresponding section.
2. Methylome Browser Page
You can view the reference genome sequence, gene annotation, and distribution
of CpG islands for specific species and specific genes in the Methylome Browser. You can also explore
the differentially methylated regions of specific experiments in the selected track.
3. Data Resources
3.1 Projects Page
On the "Projects" page, you can click "Display column" to customize the
columns of project information displayed in the table. On the left side of the page, you can click
on "Species", "Animal Tissue" and "Human Disease" to view the distribution of the internal
tree structure and to jump to the list of projects under a specific category.
All tables in MethBank can be downloaded.
3.2 Samples Page
On the "Samples" page, you can click "Show Columns" and the options under "Show
Columns" to customize the displayed sample information and filter samples by specific criteria, as
shown in the figure below. A detailed explanation of Quality Assessment can be found in
3.3 Genes Page
On the "Genes" page, clicking a gene ID hyperlink opens the methylation overview page
for that gene. Taking human gene ENSG00000000003 as an example, you can see not only the
relevant information of this gene, but also the table and line graph of the methylation profiles of CG
and CH (H = A, T or C) of this gene in specific samples.
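The CG and CH contexts mentioned above depend only on the base that follows the cytosine. A minimal sketch, assuming a forward-strand sequence and a 0-based index (the function name is hypothetical):

```python
def cytosine_context(seq: str, i: int) -> str:
    """Classify the cytosine at index i as 'CG', 'CH' or 'unknown'."""
    seq = seq.upper()
    if seq[i] != "C":
        raise ValueError("position is not a cytosine")
    if i + 1 >= len(seq):
        return "unknown"  # no downstream base available
    nxt = seq[i + 1]
    if nxt == "G":
        return "CG"
    if nxt in ("A", "T", "C"):
        return "CH"  # H stands for A, T or C
    return "unknown"  # ambiguous base such as N
```

Cytosines on the reverse strand would need the reverse complement of the sequence before classification.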
3.4 CpG Islands Page
You can select a specific species and a specific sample to view the chromosome
distribution, GO, KEGG and genomic location results for highly methylated CpG islands, as well as the
catalog of genes related to methylated CpG islands for the corresponding sample.
3.5 DMRs Page
You can select a specific species, project, group, P-value and length in
the left panel to view the corresponding DMR distribution on chromosomes, GO & KEGG enrichment, genomics
3.6 Publications Page
This page lists the publications associated with the Data
Resources, Tool Collections and Featured DMGs modules.
4.1 Featured DMGs Page
The Featured DMGs module summarizes featured DMGs associated with biological contexts through full-scale manual
curation, making it easier to retrieve epigenetic marker genes and properties shared across different
kinds of biological contexts. This page presents 266 DNA methylation-related publications that we have
curated through a standardized curation process. You can view detailed information in the table in the
lower part of the page, including species, tissues, diseases, conditions, enrichment functions, featured
You can also explore the relationship between genes and diseases at the
DNA methylation level through the interactive graphs in the Disease Network and Gene Network sections.
The Tissue Sunburst panel illustrates the tissue distribution of each biological condition.
4.2 Tool Collections Page
The Tool Collections page provides 501 methylation-related tools collected by
predefined keywords from the original literature and web sources. The tools are described by diverse
categories, types, operating systems and other indicators. These tools are grouped into five main types:
application/script, framework/library, package/module, and toolkit/suite. You can enter specific
conditions on the left side of the Tool Collections page to view the corresponding tools.
5.1 Age predictor
From this page, you can download all the single-base-resolution methylation
data and their annotation information provided in the database. We also provide sex-specific 450K
and 850K methylation data of 111 healthy human tissues.
Data can be downloaded by clicking the icon on the page or using an FTP tool (such as FileZilla Client).
If you have any questions, comments or suggestions, please send us an email
at methbank(AT)big.ac.cn, and we will respond as soon as possible.
MethBank is free for academic use only. For any commercial use, please contact
us for commercial licensing terms.