Database Commons

a catalog of worldwide biological databases

Database Profile

SmProt

General information

URL: http://bioinfo.ibp.ac.cn/SmProt
Full name: Small proteins database
Description: SmProt incorporates 255 010 small proteins computationally or experimentally identified in 291 cell lines/tissues derived from eight popular species. The database provides a variety of data including basic information (sequence, location, gene name, organism, etc.) as well as specific information (experiment, function, disease type, etc.).
Year founded: 2018
Last update: 2021
Version: v2.0
Accessibility: Accessible
Country/Region: China

Funding support

  • 2016YFC0901702
  • 81902519
  • 91940306
  • 31871294
  • 31701117
  • 31970647
  • 2017YFC0907503
  • 2016YFC0901002
  • 2018YFA0106901
  • XDB38040300
  • XXH13505-05
  • 2019FY100102

Classification & Tag

Data type:
Data object:
Database category:
Major species:
Keywords:

Contact information

University/Institution: Institute of Biophysics, Chinese Academy of Sciences
Address: Institute of Biophysics, Chinese Academy of Sciences, Beijing, China.
City: Beijing
Province/State: Beijing
Country/Region: China
Contact name (PI/Team): Runsheng Chen
Contact email (PI/Helpdesk): rschen@ibp.ac.cn

Publications

SmProt: A Reliable Repository with Comprehensive Annotation of Small Proteins Identified from Ribosome Profiling. [PMID: 34536568]
Li Y, Zhou H, Chen X, Zheng Y, Kang Q, Hao D, Zhang L, Song T, Luo H, Hao Y, Chen R, Zhang P, He S.

Small proteins specifically refer to proteins consisting of less than 100 amino acids translated from small open reading frames (sORFs), which were usually missed in previous genome annotation. The significance of small proteins has been revealed in current years, along with the discovery of their diverse functions. However, systematic annotation of small proteins is still insufficient. SmProt was specially developed to provide valuable information on small proteins for scientific community. Here we present the update of SmProt, which emphasizes reliability of translated sORFs, genetic variants in translated sORFs, disease-specific sORF translation events or sequences, and remarkably increased data volume. More components such as non-ATG translation initiation, function, and new sources are also included. SmProt incorporated 638,958 unique small proteins curated from 3,165,229 primary records, which were computationally predicted from 419 ribosome profiling (Ribo-seq) datasets or collected from literature and other sources from 370 cell lines or tissues in 8 species (Homo sapiens, Mus musculus, Rattus norvegicus, Drosophila melanogaster, Danio rerio, Saccharomyces cerevisiae, Caenorhabditis elegans, and Escherichia coli). In addition, small protein families identified from human microbiomes were also collected. All datasets in SmProt are free to access, and available for browse, search, and bulk downloads at http://bigdata.ibp.ac.cn/SmProt/.

Genomics Proteomics Bioinformatics. 2021:19(4) | 64 Citations (from Europe PMC, 2025-12-13)
SmProt: a database of small proteins encoded by annotated coding and non-coding RNA loci. [PMID: 28137767]
Hao Y, Zhang L, Niu Y, Cai T, Luo J, He S, Zhang B, Zhang D, Qin Y, Yang F, Chen R.

Small proteins is the general term for proteins with length shorter than 100 amino acids. Identification and functional studies of small proteins have advanced rapidly in recent years, and several studies have shown that small proteins play important roles in diverse functions including development, muscle contraction and DNA repair. Identification and characterization of previously unrecognized small proteins may contribute in important ways to cell biology and human health. Current databases are generally somewhat deficient in that they have either not collected small proteins systematically, or contain only predictions of small proteins in a limited number of tissues and species. Here, we present a specifically designed web-accessible database, small proteins database (SmProt, http://bioinfo.ibp.ac.cn/SmProt), which is a database documenting small proteins. The current release of SmProt incorporates 255 010 small proteins computationally or experimentally identified in 291 cell lines/tissues derived from eight popular species. The database provides a variety of data including basic information (sequence, location, gene name, organism, etc.) as well as specific information (experiment, function, disease type, etc.). To facilitate data extraction, SmProt supports multiple search options, including species, genome location, gene name and their aliases, cell lines/tissues, ORF type, gene type, PubMed ID and SmProt ID. SmProt also incorporates a service for the BLAST alignment search and provides a local UCSC Genome Browser. Additionally, SmProt defines a high-confidence set of small proteins and predicts the functions of the small proteins.

Brief Bioinform. 2018:19(4) | 103 Citations (from Europe PMC, 2025-12-13)
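
Both abstracts above define small proteins as products shorter than 100 amino acids translated from small open reading frames (sORFs). As a rough illustration of that length criterion only, the following Python sketch scans the forward strand of a DNA string for ATG-initiated ORFs whose product falls under the cutoff. It is not SmProt's pipeline: the function name `find_sorfs`, the constant `MAX_SMALL_PROTEIN_AA`, and the restriction to ATG starts, the forward strand, and the standard stop codons are all assumptions made for this example (the 2021 abstract notes that SmProt also covers non-ATG initiation).

```python
# Illustrative sketch only; not the method used by SmProt.
# All names here are hypothetical and chosen for this example.

MAX_SMALL_PROTEIN_AA = 100  # abstracts: small proteins are shorter than 100 aa
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code (assumed)


def find_sorfs(dna, max_aa=MAX_SMALL_PROTEIN_AA):
    """Return (start, end, n_aa) for forward-strand, ATG-initiated ORFs
    whose product has fewer than max_aa residues.

    start/end are 0-based, end-exclusive, and end includes the stop codon.
    n_aa counts the initiator methionine but not the stop codon.
    """
    dna = dna.upper()
    hits = []
    for frame in range(3):
        start = None  # position of the first ATG of the currently open ORF
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                n_aa = (i - start) // 3
                if n_aa < max_aa:
                    hits.append((start, i + 3, n_aa))
                start = None  # close the ORF and look for the next ATG
    return hits


if __name__ == "__main__":
    # Toy sequence: one ORF (ATG AAA TTT TAG) in the third reading frame.
    print(find_sorfs("CCATGAAATTTTAGGG"))
```

An ORF encoding exactly 100 residues is excluded under this reading of "shorter than 100 amino acids"; adjust `max_aa` if a different cutoff is intended.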

Ranking

All databases: 690/6895 (90.007%)
Raw bio-data: 53/582 (91.065%)
Total Rank: 690
Citations: 158
z-index: 22.571

Community reviews

Not Rated
Data quality & quantity:
Content organization & presentation:
System accessibility & reliability:

Record metadata

Created on: 2018-01-28
Curated by:
Di HAO [2023-07-26]
Pei Liu [2022-05-15]
Dong Zou [2019-04-28]
Saba Arshad [2018-04-16]
Yang Zhang [2018-01-28]