BIG Search

BIG Search is a scalable text search engine built on Elasticsearch (a highly scalable open-source full-text search and analytics engine based on Apache Lucene). It supports cross-domain search and gives users access to a wide range of biomedical data, not only from NGDC databases but also from partner databases throughout the world.
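
To make the idea of cross-domain search concrete, here is a minimal sketch of how one full-text query can be sent to several Elasticsearch indices at once. It only builds the request path and body and does not contact any server. The index names (`gsa`, `bioproject`, `biosample`), the field names (`title`, `description`), and the helper `build_cross_domain_query` are assumptions made for illustration; the source does not describe how BIG Search actually lays out its indices or fields.

```python
import json


def build_cross_domain_query(text, fields=("title", "description"), size=10):
    """Build an Elasticsearch request body for a full-text search.

    The field names are hypothetical; they are not taken from BIG Search.
    """
    return {
        "size": size,
        "query": {
            # multi_match runs the same text query against several fields.
            "multi_match": {
                "query": text,
                "fields": list(fields),
            }
        },
    }


# Hypothetical index names, assuming one index per database.
indices = ["gsa", "bioproject", "biosample"]

body = build_cross_domain_query("tp53")

# Elasticsearch accepts a comma-separated list of indices in the search path.
path = "/" + ",".join(indices) + "/_search"

print(path)
print(json.dumps(body, indent=2))
```

Sending `body` to `path` on an Elasticsearch cluster would search all listed indices in one request, which is one plausible way to get results that span databases.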

e.g., PRJCA000126; SAMC000385; tp53; EGFR; human; KaKs_Calculator

50,142 records from 66 NGDC & Partner databases.

Database Records Number Description
GSA 3,835 Genome Sequence Archive
VarClear 2,252 Gene Variation Interpretation Database
BioProject 988 Biological Project Library
BioSample 837 Biological Sample Library
DMS_PMO 721 A standardized ontology for human precision medicine with consistent, reusable and sustainable descriptions of human disease terms, genomic and molecular features, phenotype characteristics and related medical vocabulary and disease concepts, developed through collaborative efforts of researchers at the Institute of Medical Information, Chinese Academy of Medical Sciences.
HGD 172 Homologous Gene Database
RMVar 172 RNA Modification associated variants database
AnimalTFDB 132 Animal Transcription Factor Database
Animal-SNPAtlas 113 Animal-SNPAtlas: a comprehensive SNP database for multiple animals.
iEKPD 91 Integrated annotations for Eukaryotic protein Kinases, protein Phosphatases & phosphoprotein-binding Domains
DMS_Ensembl 62 Ensembl supports research in comparative genomics, evolution, sequence variation and transcriptional regulation. Ensembl annotates genes, computes multiple alignments, predicts regulatory function and collects disease data.
RhesusBase Genes 44
MethBank 4.0 43 a database of DNA methylation across a variety of species
DMS_ProteinOntology 38 PRO provides an ontological representation of protein-related entities by explicitly defining them and showing the relationships between them. Each PRO term represents a distinct class of entities (including specific modified forms, orthologous isoforms, and protein complexes) ranging from the taxon-neutral to the taxon-specific (e.g. the entity representing all protein products of the human SMAD2 gene is described in PR:Q15796; one particular human SMAD2 protein form, phosphorylated on the last two serines of a conserved C-terminal SSxS motif is defined by PR:000025934).
DEG 36 Database of Essential Genes
DrLLPS 36 Data resource of liquid-liquid phase separation
Pancan-MNVQTLdb 33 A database to evaluate the effects of MNVs on multiple molecular phenotypes
MethBank SRMs 32 MethBank, Single-base Resolution Methylomes (SRMs)
Gene Expression Nebulas 30 A data portal of transcriptomic profiles across multiple species
Pancan-meQTL 30 A database to evaluate the effects of SNPs on methylation.
VCG 25 Virtual Chinese Genome Database is a dynamic genome database of the Chinese population.
DMS_Swissprot 24 UniProtKB/Swiss-Prot is the expertly curated component of UniProtKB (produced by the UniProt consortium). It contains hundreds of thousands of protein descriptions, including function, domain structure, subcellular location, post-translational modifications and functionally characterized variants.
NucMap 23 A database of genome-wide nucleosome positioning maps across species.
MiCroKiTS 22 Midbody, Centrosome, Kinetochore, Telomere and Spindle
EPSD 21 Eukaryotic Phosphorylation Site Database
hTFtarget 19 In the hTFtarget database, we collected comprehensive human TF ChIP-Seq data and customized an analysis workflow to identify reliable TF targets while taking epigenomic states into account.
HGNC 18 The HGNC is responsible for approving unique symbols and names for human loci, including protein coding genes, ncRNA genes and pseudogenes, to allow unambiguous scientific communication.
GenTree 15 GenTree, the time tree of genes along the evolutionary history
NODE 13 The National Omics Data Encyclopedia
lnCAR 10 A comprehensive resource for lncRNAs from Cancer Arrays
TCOD 10 A multi-omics data platform for tropical crops
dbPAF 7 database of Phospho-sites in Animals and Fungi
DMS_MeSH 6 MeSH (Medical Subject Headings) is the NLM controlled vocabulary thesaurus used for indexing articles for PubMed.
ASCancer Atlas 6 A comprehensive knowledgebase of alternative splicing in human cancers
BioCode 5 Archive Bioinformatics Codes for Open Source Projects
Database Commons 5 a curated catalogue of biological databases.
MethBank CRMs 5 MethBank, Consensus Reference Methylomes (CRMs)
PancanQTL 5 A database to systematically identify cis-eQTLs and trans-eQTLs in 33 cancer types.
PLMD 5 Protein Lysine Modifications Database
DMS_SnomedCT_US 4 The SNOMED CT United States (US) Edition is the official source of SNOMED CT for use in US healthcare systems. The US Edition is a standalone release that combines the content of both the US Extension and the International releases of SNOMED CT.
LeukemiaDB 4 LeukemiaDB collects 3068 samples in 188 leukemia-associated RNA-seq datasets from NCBI GEO and SRA.
BBCancer 4 BBCancer: an expression atlas of blood-based biomarkers in the early diagnosis of cancers
GSA for Human 4 Genome Sequence Archive for Human
PTMD 4 A database of human disease-associated post-translational modifications
animalAPAdb 3 A comprehensive animal alternative polyadenylation database
animaleRNAdb 3 A comprehensive animal enhancer RNA database
CGGA 1 Chinese Glioma Genome Atlas
DiseaseEnhancer 1 DiseaseEnhancer: a resource of human disease-associated enhancer catalog.
EWAS Atlas 1 A knowledgebase of epigenome-wide association studies
EWAS Data Hub 1 A data hub of DNA methylation array data and metadata
BrainBase 1 Brain Disease Knowledgebase
CancerSEA 1 CancerSEA: a cancer single-cell state atlas
Cell Taxonomy 1 Cell Taxonomy is a curated repository of cell types with multifaceted characterization.
CGDB 1 Circadian Gene Database
EDK 1 Editome Disease Knowledgebase
GVM 1 Genome Variation Map
Platelets expression atlas 1 Platelet Expression Atlas (PEA) is a comprehensive expression resource and functional analysis platform for human platelets
lncRNASNP v3 1 lncRNASNP v3: a comprehensive resource for functional variants in long non-coding RNAs.
lncRNASNP2 1
LncRNAWiki 2.0 1 LncRNAWiki 2.0 is devoted to community curation of human long non-coding RNAs (lncRNAs) to provide a comprehensive and up-to-date resource of functionally annotated lncRNAs. It incorporates a comprehensive collection of experimentally studied lncRNAs, integrates a wealth of their annotations based on a standardized curation model, and improves curation quality through expert curator review and community error reporting.
miRNASNP-v3 1 miRNASNP-v3 is a comprehensive database for SNPs and disease-related variations in miRNAs and miRNA targets
TWAS Atlas 1 Transcriptome-Wide Association Studies Atlas
OpenLB 31,258 Open Library of Bioscience
Genbase Nucleotide 5,705 a collection of nucleotide sequences from several sources
Genbase Protein 3,195 a collection of protein sequences from several sources
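
Per-database record counts like those in the table above could, in principle, be produced by an Elasticsearch terms aggregation on the `_index` metadata field. The sketch below assumes one index per database, which the source does not state, and uses hypothetical lowercase index names. The response is a hand-written mock shaped like terms-aggregation output, with counts copied from the GSA, BioProject and BioSample rows; no server is contacted.

```python
def build_count_by_index_body(text):
    """Request body that counts matching documents per index.

    Assumes (hypothetically) that each database is stored in its own index.
    """
    return {
        "size": 0,  # return no documents, only the aggregation
        "query": {"simple_query_string": {"query": text}},
        "aggs": {
            "per_database": {"terms": {"field": "_index", "size": 100}}
        },
    }


def summarize(response):
    """Return (index, count) pairs sorted by descending count."""
    buckets = response["aggregations"]["per_database"]["buckets"]
    rows = [(b["key"], b["doc_count"]) for b in buckets]
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Mock response for illustration only; counts are taken from the table rows.
mock_response = {
    "aggregations": {
        "per_database": {
            "buckets": [
                {"key": "bioproject", "doc_count": 988},
                {"key": "gsa", "doc_count": 3835},
                {"key": "biosample", "doc_count": 837},
            ]
        }
    }
}

request_body = build_count_by_index_body("human")
rows = summarize(mock_response)
total = sum(count for _, count in rows)

for name, count in rows:
    print(f"{name}\t{count:,}")
print(f"total\t{total:,}")
```

Setting `size` to 0 keeps the response small, since only the bucket counts are needed for a summary table of this kind.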

Powered by EBISearch

Powered by NCBI Entrez

Powered by EBI AlphaFold DB