BIG Search

BIG Search is a scalable text search engine built on Elasticsearch, a highly scalable open-source full-text search and analytics engine based on Apache Lucene. It supports cross-domain search, giving users access to a wide range of biomedical data from NGDC databases as well as partner databases throughout the world.

e.g., PRJCA000126; SAMC000385; tp53; EGFR; human; KaKs_Calculator
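As a rough illustration of what a cross-domain full-text query can look like in Elasticsearch, the sketch below builds a request body for a single search term and counts hits per index. It is not BIG Search's actual implementation: the field names `title` and `description`, the helper names, and the assumption that each database maps to its own index are all hypothetical. Only the `multi_match` query and the `terms` aggregation on `_index` are standard Elasticsearch query DSL. No request is sent; the code only prints the JSON bodies.

```python
import json


def split_terms(raw):
    """Split a semicolon-separated list of example terms, dropping stray spaces."""
    return [t.strip() for t in raw.split(";") if t.strip()]


def build_cross_domain_query(term, fields=("title", "description"), size=10):
    """Build an Elasticsearch request body for one search term.

    Assumptions (not taken from BIG Search itself):
    - every indexed record has the hypothetical text fields in `fields`;
    - each database is stored in its own index, so grouping on `_index`
      yields a per-database record count.
    """
    return {
        "size": size,
        # multi_match runs the same full-text query against several fields
        "query": {"multi_match": {"query": term, "fields": list(fields)}},
        # terms aggregation on the _index metadata field: hits per index
        "aggs": {
            "records_per_database": {"terms": {"field": "_index", "size": 100}}
        },
    }


if __name__ == "__main__":
    examples = "PRJCA000126;SAMC000385;tp53;EGFR; human; KaKs_Calculator"
    for term in split_terms(examples):
        print(json.dumps(build_cross_domain_query(term)))
```

Such a body would typically be posted to a `_search` endpoint covering several indices at once, which is one way a per-database record count like the one listed here could be produced.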

170,140 records from 66 NGDC & Partner databases.

Database   Number of Records   Description
Animal-SNPAtlas 10,281 Animal-SNPAtlas: a comprehensive SNP database for multiple animals.
GSA 3,945 Genome Sequence Archive
BioProject 1,106 Biological Project Library
BioSample 548 Biological Sample Library
DMS_PMO 482 A standardized ontology for human precision medicine with consistent, reusable and sustainable descriptions of human disease terms, genomic and molecular characteristics, phenotype characteristics and related medical vocabulary, developed through collaborative efforts of researchers at the Institute of Medical Information, Chinese Academy of Medical Sciences.
iEKPD 323 Integrated annotations for Eukaryotic protein Kinases, protein Phosphatases & phosphoprotein-binding Domains
PancanQTL 273 A database to systematically identify cis-eQTLs and trans-eQTLs in 33 cancer types.
RMVar 257 RNA Modification associated variants database
ncRNA-eQTL 219 A database to evaluate the effects of SNPs on ncRNA expression
Pancan-meQTL 216 A database to evaluate the effects of SNPs on methylation.
Pancan-MNVQTLdb 113 A database to evaluate the effects of MNVs on multiple molecular phenotypes
EKPD 109 Eukaryotic Kinase and Phosphatase Database
AnimalTFDB 102 Animal Transcription Factor Database
DMS_MeSH 64 MeSH (Medical Subject Headings) is the NLM controlled vocabulary thesaurus used for indexing articles for PubMed.
HGD 34 Homologous Gene Database
DMS_Ensembl 17 Ensembl supports research in comparative genomics, evolution, sequence variation and transcriptional regulation. Ensembl annotates genes, computes multiple alignments, predicts regulatory function and collects disease data.
MethBank 4.0 17 a database of DNA methylation across a variety of species
Gene Expression Nebulas 17 A data portal of transcriptomic profiles across multiple species
GeneOntology 14 The Gene Ontology knowledgebase provides a computational representation of our current scientific knowledge about the functions of genes (or, more properly, the protein and non-coding RNA molecules produced by genes) from many different organisms, from humans to bacteria. It is widely used to support scientific research, and has been cited in tens of thousands of publications.
ASCancer Atlas 14 A comprehensive knowledgebase of alternative splicing in human cancers
NucMap 14 A database of genome-wide nucleosome positioning maps across species.
GSA for Human 13 Genome Sequence Archive for Human
DMS_Swissprot 11 UniProtKB/Swiss-Prot is the expertly curated component of UniProtKB (produced by the UniProt consortium). It contains hundreds of thousands of protein descriptions, including function, domain structure, subcellular location, post-translational modifications and functionally characterized variants.
SEGreg 11 Database of specifically expressed genes and regulation
NODE 10 The National Omics Data Encyclopedia
RhesusBase Genes 9
MethBank SRMs 7 Methbank, Single-base Resolution Methylomes (SRMs)
VCG 7 Virtual Chinese Genome Database is a dynamic genome database of the Chinese population.
DMS_SnomedCT_US 5 The SNOMED CT United States (US) Edition is the official source of SNOMED CT for use in US healthcare systems. The US Edition is a standalone release that combines the content of both the US Extension and the International releases of SNOMED CT.
BioCode 5 Archive Bioinformatics Codes for Open Source Projects
DEG 5 Database of Essential Genes
DMS_Chemical Entities of Biological Interest Ontology 4 Chemical Entities of Biological Interest (ChEBI) is a freely available dictionary of molecular entities focused on ‘small’ chemical compounds.
LeukemiaDB 4 LeukemiaDB collects 3068 samples in 188 leukemia-associated RNA-seq datasets from NCBI GEO and SRA.
EPSD 4 Eukaryotic Phosphorylation Site Database
hTFtarget 4 In this hTFtarget database, we collected comprehensive human TF ChIP-Seq data and customized an analysis workflow to identify reliable TF targets while taking epigenomic states into account.
lncRNASNP v3 4 lncRNASNP v3: a comprehensive resource for functional variants in long non-coding RNAs.
DMS_ProteinOntology 3 PRO provides an ontological representation of protein-related entities by explicitly defining them and showing the relationships between them. Each PRO term represents a distinct class of entities (including specific modified forms, orthologous isoforms, and protein complexes) ranging from the taxon-neutral to the taxon-specific (e.g. the entity representing all protein products of the human SMAD2 gene is described in PR:Q15796; one particular human SMAD2 protein form, phosphorylated on the last two serines of a conserved C-terminal SSxS motif is defined by PR:000025934).
BrainBase 3 Brain Disease Knowledgebase
Database Commons 3 a curated catalogue of biological databases.
dbPAF 3 database of Phospho-sites in Animals and Fungi
lnCAR 3 A comprehensive resource for lncRNAs from Cancer Arrays
HGNC 2 The HGNC is responsible for approving unique symbols and names for human loci, including protein coding genes, ncRNA genes and pseudogenes, to allow unambiguous scientific communication.
animalAPAdb 2 A comprehensive animal alternative polyadenylation database
animaleRNAdb 2 A comprehensive animal enhancer RNA database
BBCancer 2 BBCancer: an expression atlas of blood-based biomarkers in the early diagnosis of cancers
CancerSEA 2 CancerSEA: a cancer single-cell state atlas
Cell Taxonomy 2 Cell Taxonomy is a curated repository of cell types with multifaceted characterization.
GenTree 2 GenTree, the time tree of genes along the evolutionary history
GVM 2 Genome Variation Map
Platelets expression atlas 2 Platelet Expression Atlas (PEA) is a comprehensive expression resource and functional analysis platform for human platelets
lncRNASNP2 2
Methbank CRMs 2 Methbank, Consensus Reference Methylomes (CRMs)
PLMD 2 Protein Lysine Modifications Database
PTMD 2 A database of human disease-associated post-translational modifications
DMS_SnomedCT_International 1 SNOMED International determines global standards for health terms, an essential part of improving the health of humankind. We are committed to maintaining and growing our leadership as the global experts in healthcare terminology, ensuring SNOMED CT is the global language for clinical terms.
DiseaseEnhancer 1 DiseaseEnhancer: a resource of human disease-associated enhancer catalog.
EWAS Atlas 1 A knowledgebase of epigenome-wide association studies
EWAS Data Hub 1 A data hub of DNA methylation array data and metadata
EDK 1 Editome Disease Knowledgebase
LncRNAWiki 2.0 1 LncRNAWiki 2.0 is devoted to community curation of human long non-coding RNAs (lncRNAs) to provide a comprehensive and up-to-date resource of functionally annotated lncRNAs. It incorporates a comprehensive collection of experimentally studied lncRNAs and integrates a wealth of their annotations based on a standardized curation model, and improves curation quality through expert curator review and community error report.
miRNASNP-v3 1 miRNASNP-v3 is a comprehensive database for SNPs and disease-related variations in miRNAs and miRNA targets
TWAS Atlas 1 Transcriptome-Wide Association Studies Atlas
OpenLB 93,519 Open Library of Bioscience
Genbase Nucleotide 39,630 a collection of nucleotide sequences from several sources
Genbase Protein 18,659 a collection of protein sequences from several sources

Powered by EBISearch


Powered by NCBI Entrez


Powered by EBI AlphaFold DB