BIG Search

BIG Search is a scalable text search engine built on Elasticsearch, a highly scalable open-source full-text search and analytics engine based on Apache Lucene. It supports cross-domain search, giving users access to a wide range of biomedical data from NGDC databases and from partner databases around the world.
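
As a rough illustration of what a cross-domain query can look like at the Elasticsearch level, the sketch below builds a request path and body for a full-text search spanning several indices. The index names and the `build_search_request` helper are hypothetical, chosen only for illustration; the actual BIG Search index layout and API are not described here. Only the standard Elasticsearch conventions (comma-separated index names in the path, a `multi_match` query in the body) are assumed.

```python
import json

# Hypothetical index names; the real BIG Search index layout is not documented here.
INDICES = ["bioproject", "biosample", "gsa"]


def build_search_request(term, indices=INDICES, size=10):
    """Return (path, body) for an Elasticsearch search across several indices."""
    # Elasticsearch accepts a comma-separated list of indices in the URL path.
    path = "/" + ",".join(indices) + "/_search"
    body = {
        "size": size,
        "query": {
            "multi_match": {
                "query": term,
                # "*" matches all eligible fields, so no field names are assumed.
                "fields": ["*"],
            }
        },
    }
    return path, body


path, body = build_search_request("tp53")
print(path)
print(json.dumps(body, indent=2))
```

The body is built as a plain dictionary so that it can be inspected or sent with any HTTP client; nothing in the sketch contacts a server.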

e.g., PRJCA000126; SAMC000385; tp53; EGFR; human; KaKs_Calculator

543,722,165 records from 14 NGDC and partner databases.

| Database | Records | Description |
| --- | --- | --- |
| DMS_MeSH | 12,060 | MeSH (Medical Subject Headings) is the NLM controlled vocabulary thesaurus used for indexing articles for PubMed. |
| BioProject | 2,683 | Biological Project Library |
| GSA | 1,244 | Genome Sequence Archive |
| Database Commons | 46 | A curated catalogue of biological databases. |
| BioCode | 27 | Archive Bioinformatics Codes for Open Source Projects |
| BioSample | 11 | Biological Sample Library |
| KGCoV | 2 | KGCoV (Knowledge Graph of SARS-CoV-2) structures and matches COVID-19 epidemiological information and SARS-CoV-2 genomic data with combined curation methods, and integrates variation information generated by bioinformatic tools. |
| SequenceOntology | 1 | The Sequence Ontology is a set of terms and relationships used to describe the features and attributes of biological sequences. SO includes different kinds of features which can be located on the sequence. Biological features are those which are defined by their disposition to be involved in a biological process. |
| AnimalTFDB | 1 | Animal Transcription Factor Database |
| LncRNAWiki 2.0 | 1 | LncRNAWiki 2.0 is devoted to community curation of human long non-coding RNAs (lncRNAs) to provide a comprehensive and up-to-date resource of functionally annotated lncRNAs. It incorporates a comprehensive collection of experimentally studied lncRNAs, integrates a wealth of their annotations based on a standardized curation model, and improves curation quality through expert curator review and community error reports. |
| OpenLB | 223,209 | Open Library of Bioscience |
| Genbase Nucleotide | 266,838,718 | A collection of nucleotide sequences from several sources |
| Genbase Protein | 272,847,109 | A collection of protein sequences from several sources |
| RCoV19 | 3,797,053 | Resource for Coronavirus 2019 |
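
The per-database counts listed above can be checked against the stated total with a few lines of arithmetic. The snippet below is only a consistency check on the numbers shown on this page; the `records` dictionary is transcribed by hand from the listing.

```python
# Record counts transcribed from the listing above.
records = {
    "DMS_MeSH": 12_060,
    "BioProject": 2_683,
    "GSA": 1_244,
    "Database Commons": 46,
    "BioCode": 27,
    "BioSample": 11,
    "KGCoV": 2,
    "SequenceOntology": 1,
    "AnimalTFDB": 1,
    "LncRNAWiki 2.0": 1,
    "OpenLB": 223_209,
    "Genbase Nucleotide": 266_838_718,
    "Genbase Protein": 272_847_109,
    "RCoV19": 3_797_053,
}

total = sum(records.values())
print(f"{total:,} records from {len(records)} databases")
# prints: 543,722,165 records from 14 databases
```

The two Genbase collections account for nearly all of the total; the remaining twelve databases together contribute about four million records.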

Powered by EBISearch

Powered by NCBI Entrez

Powered by EBI AlphaFold DB