Database Commons

a catalog of worldwide biological databases

Database Profile

General information

Full name:
Description: PubChem is a public repository for information on chemical substances and their biological activities, launched in 2004 as a component of the Molecular Libraries Roadmap Initiatives of the US National Institutes of Health (NIH).
Year founded: 2004
Last update: 2017-01-01
Country/Region: United States

Classification & Tag

Data type:
Data object:
Database category:
Major species:

Contact information

University/Institution: National Center for Biotechnology Information
Address: Bethesda, Maryland 20894, USA
City: Bethesda
Province/State: Maryland
Country/Region: United States
Contact name (PI/Team): Yanli Wang
Contact email (PI/Helpdesk):


PubChem 2023 update. [PMID: 36305812]
Sunghwan Kim, Jie Chen, Tiejun Cheng, Asta Gindulyte, Jia He, Siqian He, Qingliang Li, Benjamin A Shoemaker, Paul A Thiessen, Bo Yu, Leonid Zaslavsky, Jian Zhang, Evan E Bolton

PubChem is a popular chemical information resource that serves a wide range of use cases. In the past two years, a number of changes were made to PubChem. Data from more than 120 data sources was added to PubChem. Some major highlights include: the integration of Google Patents data into PubChem, which greatly expanded the coverage of the PubChem Patent data collection; the creation of the Cell Line and Taxonomy data collections, which provide quick and easy access to chemical information for a given cell line and taxon, respectively; and the update of the bioassay data model. In addition, new functionalities were added to the PubChem programmatic access protocols, PUG-REST and PUG-View, including support for target-centric data download for a given protein, gene, pathway, cell line, and taxon and the addition of the 'standardize' option to PUG-REST, which returns the standardized form of an input chemical structure. A significant update was also made to PubChemRDF. The present paper provides an overview of these changes.

Nucleic Acids Res. 2023:51(D1) | 369 Citations (from Europe PMC, 2024-09-07)
PubChem Protein, Gene, Pathway, and Taxonomy Data Collections: Bridging Biology and Chemistry through Target-Centric Views of PubChem Data. [PMID: 35227770]
Sunghwan Kim, Tiejun Cheng, Siqian He, Paul A Thiessen, Qingliang Li, Asta Gindulyte, Evan E Bolton

PubChem is a public chemical database at the U.S. National Institutes of Health. Visited by millions of users every month, it plays a role as a key chemical information resource for biomedical research communities. Data in PubChem is from hundreds of contributors and organized into multiple collections by record type. Among these are the Protein, Gene, Pathway, and Taxonomy data collections. Records in these collections contain information on chemicals related to a given biological target (i.e., protein, gene, pathway, or taxon), helping users to analyze and interpret the biological activity data of molecules. In addition, annotations about the biological targets are collected from authoritative or curated data sources and integrated into the four collections. The content can be programmatically accessed through PubChem's web service interfaces (including PUG View). A machine-readable representation of this content is also provided within PubChemRDF.

J Mol Biol. 2022 | 21 Citations (from Europe PMC, 2024-09-07)
PubChem in 2021: new data content and improved web interfaces. [PMID: 33151290]
Sunghwan Kim, Jie Chen, Tiejun Cheng, Asta Gindulyte, Jia He, Siqian He, Qingliang Li, Benjamin A Shoemaker, Paul A Thiessen, Bo Yu, Leonid Zaslavsky, Jian Zhang, Evan E Bolton

PubChem is a popular chemical information resource that serves the scientific community as well as the general public, with millions of unique users per month. In the past two years, PubChem made substantial improvements. Data from more than 100 new data sources were added to PubChem, including chemical-literature links from Thieme Chemistry, chemical and physical property links from SpringerMaterials, and patent links from the World Intellectual Property Organization (WIPO). PubChem's homepage and individual record pages were updated to help users find desired information faster. This update involved a data model change for the data objects used by these pages as well as by programmatic users. Several new services were introduced, including the PubChem Periodic Table and Element pages, Pathway pages, and Knowledge panels. Additionally, in response to the coronavirus disease 2019 (COVID-19) outbreak, PubChem created a special data collection that contains PubChem data related to COVID-19 and the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2).

Nucleic Acids Res. 2021:49(D1) | 1146 Citations (from Europe PMC, 2024-09-07)
Exploring Chemical Information in PubChem. [PMID: 34370395]
Sunghwan Kim

PubChem is a public chemical database that serves scientific communities as well as the general public. This database collects chemical information from hundreds of data sources and organizes them into multiple data collections, including Substance, Compound, BioAssay, Protein, Gene, Pathway, and Patent. These collections are interlinked with each other, allowing users to discover related records in the various collections (e.g., drugs targeting a protein or genes modulated by a chemical). PubChem can be searched by keyword (e.g., a chemical, protein, or gene name) as well as by chemical structure. The input structure can be provided using popular line notations or drawn with the PubChem Sketcher. PubChem supports various types of structure searches, including identity search, 2-D and 3-D similarity searches, and substructure and superstructure searches. Results from multiple searches can be combined using Boolean operators (i.e., AND, OR, and NOT) to formulate complex queries. PubChem allows the user to quickly retrieve a list of records annotated with a particular classification or ontological term. This paper provides step-by-step instructions on how to explore PubChem data with examples of commonly requested tasks. © 2021. This article is a U.S. Government work and is in the public domain in the USA. Current Protocols published by Wiley Periodicals LLC.
Basic Protocol 1: Finding genes and proteins that interact with a given compound
Basic Protocol 2: Finding drug-like compounds similar to a query compound through a two-dimensional (2-D) similarity search
Basic Protocol 3: Finding compounds similar to a query compound through a three-dimensional (3-D) similarity search
Support Protocol: Computing similarity scores between compounds
Basic Protocol 4: Getting the bioactivity data for the hit compounds from substructure search
Basic Protocol 5: Finding drugs that target a particular gene
Basic Protocol 6: Getting bioactivity data of all chemicals tested against a protein
Basic Protocol 7: Finding compounds annotated with classifications or ontological terms
Basic Protocol 8: Finding stereoisomers and isotopomers of a compound through identity search
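
The abstract above notes that results from multiple searches can be combined with AND, OR, and NOT. The following is a minimal client-side sketch of that idea, assuming each search has already returned a list of compound identifiers (CIDs). The function name `combine_hits` and the sample CID lists are hypothetical and are not part of any PubChem interface; NOT is interpreted here as "in the first list but not in the second".

```python
def combine_hits(left, right, operator):
    """Combine two collections of CIDs using AND, OR, or NOT.

    NOT keeps the CIDs that appear in `left` but not in `right`.
    The result is sorted so that repeated runs give the same order.
    """
    a, b = set(left), set(right)
    op = operator.upper()
    if op == "AND":
        result = a & b
    elif op == "OR":
        result = a | b
    elif op == "NOT":
        result = a - b
    else:
        raise ValueError(f"unsupported operator: {operator!r}")
    return sorted(result)


# Hypothetical CID lists standing in for two saved searches.
similarity_hits = [2244, 3672, 1983]
substructure_hits = [2244, 1983, 5090]

print(combine_hits(similarity_hits, substructure_hits, "AND"))
print(combine_hits(similarity_hits, substructure_hits, "NOT"))
```

Sets are used because each search result is an unordered collection of unique record identifiers, which maps directly onto intersection, union, and difference.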

Curr Protoc. 2021:1(8) | 24 Citations (from Europe PMC, 2024-09-07)
PubChem Periodic Table and Element Pages: Improving Access to Information on Chemical Elements from Authoritative Sources. [PMID: 34268481]
Sunghwan Kim, Asta Gindulyte, Jian Zhang, Paul A Thiessen, Evan E Bolton

PubChem is one of the top five most visited chemistry web sites in the world, with more than five million unique users per month (as of March 2020). Many of these users are educators, undergraduate students, and graduate students at academic institutions. Therefore, PubChem has a great potential as an online resource for chemical education. This paper describes the PubChem Periodic Table and Element pages, which were recently introduced to celebrate the 150th anniversary of the periodic table. These services help users navigate the abundant chemical element data available within PubChem, while providing a convenient entry point to explore additional chemical content, such as biological activities and health and safety data available in PubChem Compound pages for specific elements and their isotopes. The PubChem Periodic Table and Element pages are also available as widgets, which enable web developers to display PubChem's element data on web pages they design. The elemental data can be downloaded in common file formats and imported into data analysis programs (e.g., spreadsheet software, like Microsoft Excel and Google Sheets, and computer scripts, such as Python and R). Overall, the PubChem Periodic Table and Element pages improve access to chemical element data from authoritative sources.

Chem Teach Int. 2021:3(1) | 9 Citations (from Europe PMC, 2024-09-07)
PubChem 2019 update: improved access to chemical data. [PMID: 30371825]
Kim S, Chen J, Cheng T, Gindulyte A, He J, He S, Li Q, Shoemaker BA, Thiessen PA, Yu B, Zaslavsky L, Zhang J, Bolton EE.

PubChem is a key chemical information resource for the biomedical research community. Substantial improvements were made in the past few years. New data content was added, including spectral information, scientific articles mentioning chemicals, and information for food and agricultural chemicals. PubChem released new web interfaces, such as PubChem Target View page, Sources page, Bioactivity dyad pages and Patent View page. PubChem also released a major update to PubChem Widgets and introduced a new programmatic access interface, called PUG-View. This paper describes these new developments in PubChem.
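
As an illustration of the PUG-View interface mentioned in this abstract, the sketch below builds a record URL and walks the nested sections of a response. It rests on assumptions that do not come from this page: the base URL, the `data/<record type>/<id>/<format>` path layout, and the `Record` / `Section` / `TOCHeading` keys reflect the PUG-View service as commonly documented and should be checked against the current PubChem documentation. The `sample` dictionary is hand-written for illustration and is not real PubChem output. No network request is made.

```python
from urllib.parse import quote

# Assumed base URL for PUG-View (not stated on this page).
PUG_VIEW_BASE = "https://pubchem.ncbi.nlm.nih.gov/rest/pug_view"


def pug_view_url(record_type, record_id, output="JSON"):
    """Build a PUG-View URL for one record, e.g. a compound by CID."""
    return f"{PUG_VIEW_BASE}/data/{quote(record_type)}/{record_id}/{output}"


def list_headings(sections, depth=0):
    """Yield (depth, heading) for every section, visiting children depth-first."""
    for section in sections:
        yield depth, section.get("TOCHeading", "")
        yield from list_headings(section.get("Section", []), depth + 1)


# Hand-written stand-in for a parsed PUG-View JSON response.
sample = {
    "Record": {
        "RecordType": "CID",
        "RecordNumber": 2244,
        "Section": [
            {
                "TOCHeading": "Names and Identifiers",
                "Section": [{"TOCHeading": "Computed Descriptors"}],
            },
            {"TOCHeading": "Safety and Hazards"},
        ],
    }
}

print(pug_view_url("compound", 2244))
for depth, heading in list_headings(sample["Record"]["Section"]):
    print("  " * depth + heading)
```

A recursive generator is a natural fit here because the sections form a tree of arbitrary depth.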

Nucleic Acids Res. 2019:47(D1) | 1008 Citations (from Europe PMC, 2024-09-07)
An update on PUG-REST: RESTful interface for programmatic access to PubChem. [PMID: 29718389]
Kim S, Thiessen PA, Cheng T, Yu B, Bolton EE.

PubChem is one of the largest open chemical information resources available. It currently receives millions of unique users per month on average, serving as a key resource for many research fields such as cheminformatics, chemical biology, medicinal chemistry, and drug discovery. PubChem provides multiple programmatic access routes to its data and services. One of them is PUG-REST, a Representational State Transfer (REST)-like web service interface to PubChem. On average, PUG-REST receives more than a million requests per day from tens of thousands of unique users. The present paper provides an update on PUG-REST since our previous paper published in 2015. This includes access to new kinds of data (e.g. concise bioactivity data, table of contents headings, etc.), full implementation of synchronous fast structure search, support for assay data retrieval using accession identifiers in response to the deprecation of NCBI's GI numbers, data exchange between PUG-REST and NCBI's E-Utilities through the List Gateway, implementation of dynamic traffic control through throttling, and enhanced usage policies. In addition, example Perl scripts are provided, which the user can easily modify, run, or translate into another scripting language.
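
The paper summarized above ships Perl examples. As a rough Python counterpart, the sketch below shows how a PUG-REST request URL can be assembled and how a client might pace its own requests so as to stay within the usage policy. Several details are assumptions not taken from this page: the base URL, the `<domain>/<namespace>/<identifier>/<operation>/<output>` path layout, and the default of five requests per second reflect the commonly documented PUG-REST conventions and should be verified against the current PubChem documentation. The `Throttle` class is a hypothetical helper, not part of PubChem, and no network request is made.

```python
import time
from urllib.parse import quote

# Assumed base URL for PUG-REST (not stated on this page).
PUG_REST_BASE = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def pug_rest_url(domain, namespace, identifier, operation, output="JSON"):
    """Build a PUG-REST URL from its path components."""
    ident = quote(str(identifier), safe=",")  # keep commas in ID lists
    return f"{PUG_REST_BASE}/{domain}/{namespace}/{ident}/{operation}/{output}"


class Throttle:
    """Client-side pacing: enforce a minimum interval between requests."""

    def __init__(self, max_per_second=5.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / max_per_second
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous request, if any

    def wait(self):
        """Block until at least min_interval has passed since the last call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


url = pug_rest_url(
    "compound", "name", "aspirin", "property/MolecularFormula,MolecularWeight"
)
print(url)
```

In use, a caller would invoke `Throttle.wait()` immediately before each HTTP request. The clock and sleep functions are injectable so that the pacing logic can be tested without real delays.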

Nucleic Acids Res. 2018:46(W1) | 43 Citations (from Europe PMC, 2024-09-07)
Finding Potential Multitarget Ligands Using PubChem. [PMID: 30334203]
Kim S, Shoemaker BA, Bolton EE, Bryant SH.

PubChem is a key chemical information resource, developed and maintained by the US National Institutes of Health. The present chapter describes how to find potential multitarget ligands from PubChem that would be tested in further experiments. While the protocol presented here uses PubChem's Web-based interfaces to allow users to follow it interactively, it can also be implemented in computer software by using programmatic access interfaces to PubChem (such as PUG-REST or E-Utilities).

Methods Mol Biol. 2018:1825 | 9 Citations (from Europe PMC, 2024-09-07)
PubChem BioAssay: 2017 update. [PMID: 27899599]
Wang Y, Bryant SH, Cheng T, Wang J, Gindulyte A, Shoemaker BA, Thiessen PA, He S, Zhang J.

PubChem's BioAssay database has served as a public repository for small-molecule and RNAi screening data since 2004, providing open access to its data content for the community. PubChem accepts data submissions from researchers worldwide in academia, industry and government agencies. PubChem also collaborates with other chemical biology database stakeholders through data exchange. After more than a decade of development effort, it has become an important information resource supporting drug discovery and chemical biology research. To facilitate data discovery, PubChem is integrated with all other databases at NCBI. In this work, we provide an update for the PubChem BioAssay database describing several recent developments, including added sources of research data, a redesigned BioAssay record page, a new BioAssay classification browser and new features in the Upload system facilitating data sharing. Published by Oxford University Press on behalf of Nucleic Acids Research 2016. This work is written by (a) US Government employee(s) and is in the public domain in the US.

Nucleic Acids Res. 2017:45(D1) | 225 Citations (from Europe PMC, 2024-09-07)
PubChem Substance and Compound databases. [PMID: 26400175]
Kim S, Thiessen PA, Bolton EE, Chen J, Fu G, Gindulyte A, Han L, He J, He S, Shoemaker BA, Wang J, Yu B, Zhang J, Bryant SH.

PubChem is a public repository for information on chemical substances and their biological activities, launched in 2004 as a component of the Molecular Libraries Roadmap Initiatives of the US National Institutes of Health (NIH). Over the past 11 years, PubChem has grown into a sizable system, serving as a chemical information resource for the scientific research community. PubChem consists of three inter-linked databases: Substance, Compound and BioAssay. The Substance database contains chemical information deposited by individual data contributors to PubChem, and the Compound database stores unique chemical structures extracted from the Substance database. Biological activity data of chemical substances tested in assay experiments are contained in the BioAssay database. This paper provides an overview of the PubChem Substance and Compound databases, including data sources and contents, data organization, data submission using PubChem Upload, chemical structure standardization, web-based interfaces for textual and non-textual searches, and programmatic access. It also gives a brief description of PubChem3D, a resource derived from theoretical three-dimensional structures of compounds in PubChem, as well as PubChemRDF, Resource Description Framework (RDF)-formatted PubChem data for data sharing, analysis and integration with information contained in other databases. Published by Oxford University Press on behalf of Nucleic Acids Research 2015. This work is written by (a) US Government employee(s) and is in the public domain in the US.

Nucleic Acids Res. 2016:44(D1) | 1610 Citations (from Europe PMC, 2024-09-07)
PubChem BioAssay: 2014 update. [PMID: 24198245]
Wang Y, Suzek T, Zhang J, Wang J, He S, Cheng T, Shoemaker BA, Gindulyte A, Bryant SH.

PubChem's BioAssay database is a public repository for archiving biological tests of small molecules generated through high-throughput screening experiments, medicinal chemistry studies, chemical biology research and drug discovery programs. In addition, the BioAssay database contains data from high-throughput RNA interference screening aimed at identifying critical genes responsible for a biological process or disease condition. The mission of PubChem is to serve the community by providing free and easy access to all deposited data. To this end, PubChem BioAssay is integrated into the National Center for Biotechnology Information retrieval system, making its records searchable by Entrez queries and cross-linked to other biomedical information archived at the National Center for Biotechnology Information. Moreover, PubChem BioAssay provides web-based and programmatic tools allowing users to search, access and analyze bioassay test results and metadata. In this work, we provide an update for the PubChem BioAssay resource, such as information content growth, new developments supporting data integration and search, and the recently deployed PubChem Upload to streamline chemical structure and bioassay submissions.

Nucleic Acids Res. 2014:42(Database issue) | 134 Citations (from Europe PMC, 2024-09-07)
PubChem's BioAssay Database. [PMID: 22140110]
Wang Y, Xiao J, Suzek TO, Zhang J, Wang J, Zhou Z, Han L, Karapetyan K, Dracheva S, Shoemaker BA, Bolton E, Gindulyte A, Bryant SH.

PubChem is a public repository for biological activity data of small molecules and RNAi reagents. The mission of PubChem is to deliver free and easy access to all deposited data, and to provide intuitive data analysis tools. The PubChem BioAssay database currently contains 500,000 descriptions of assay protocols, covering 5000 protein targets, 30,000 gene targets and providing over 130 million bioactivity outcomes. PubChem's bioassay data are integrated into the NCBI Entrez information retrieval system, thus making PubChem data searchable and accessible by Entrez queries. Also, as a repository, PubChem constantly optimizes and develops its deposition system answering many demands of both high- and low-volume depositors. The PubChem information platform allows users to search, review and download bioassay description and data. The PubChem platform also enables researchers to collect, compare and analyze biological test results through web-based and programmatic tools. In this work, we provide an update for the PubChem BioAssay resource, including information content growth, data model extension and new developments of data submission, retrieval, analysis and download tools.
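
Because the abstract above describes BioAssay data as searchable through Entrez queries, a short sketch of how such a query URL might be built is given here. The details are assumptions not taken from this page: the E-utilities `esearch.fcgi` endpoint, the `pcassay` database name, and the `term`, `retmax`, and `retmode` parameters follow the NCBI E-utilities conventions as commonly documented and should be confirmed there. The function name `bioassay_esearch_url` is hypothetical, and no network request is made.

```python
from urllib.parse import urlencode

# Assumed NCBI E-utilities search endpoint (not stated on this page).
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def bioassay_esearch_url(term, retmax=20):
    """Build an Entrez ESearch URL that queries the PubChem BioAssay database."""
    params = {
        "db": "pcassay",    # assumed Entrez name for PubChem BioAssay
        "term": term,       # free-text Entrez query
        "retmax": retmax,   # maximum number of assay IDs to return
        "retmode": "json",  # ask for a JSON response
    }
    return f"{EUTILS_ESEARCH}?{urlencode(params)}"


print(bioassay_esearch_url("kinase inhibitor", retmax=5))
```

`urlencode` handles the escaping of spaces and special characters in the query term, so callers can pass a plain search phrase.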

Nucleic Acids Res. 2012:40(Database issue) | 273 Citations (from Europe PMC, 2024-09-07)
An overview of the PubChem BioAssay resource. [PMID: 19933261]
Wang Y, Bolton E, Dracheva S, Karapetyan K, Shoemaker BA, Suzek TO, Wang J, Xiao J, Zhang J, Bryant SH.

The PubChem BioAssay database is a public repository for biological activities of small molecules and small interfering RNAs (siRNAs) hosted by the US National Institutes of Health (NIH). It archives experimental descriptions of assays and biological test results and makes the information freely accessible to the public. A PubChem BioAssay data entry includes an assay description, a summary and detailed test results. Each assay record is linked to the molecular target, whenever possible, and is cross-referenced to other National Center for Biotechnology Information (NCBI) database records. 'Related BioAssays' are identified by examining the assay target relationship and activity profile of commonly tested compounds. A key goal of PubChem BioAssay is to make the biological activity information easily accessible through the NCBI information retrieval system, Entrez, and various web-based PubChem services. An integrated suite of data analysis tools is available to optimize the utility of the chemical structure and biological activity information within PubChem, enabling researchers to aggregate, compare and analyze biological test results contributed by multiple organizations. In this work, we describe the PubChem BioAssay database, including data model, bioassay deposition and utilities that PubChem provides for searching, downloading and analyzing the biological activity information contained therein.

Nucleic Acids Res. 2010:38(Database issue) | 153 Citations (from Europe PMC, 2024-09-07)
PubChem as a public resource for drug discovery. [PMID: 20970519]
Li Q, Cheng T, Wang Y, Bryant SH.

PubChem is a public repository of small molecules and their biological properties. Currently, it contains more than 25 million unique chemical structures and 90 million bioactivity outcomes associated with several thousand macromolecular targets. To address the potential utility of this public resource for drug discovery, we systematically summarized the protein targets in PubChem by function, 3D structure and biological pathway. Moreover, we analyzed the potency, selectivity and promiscuity of the bioactive compounds identified for these biological targets, including the chemical probes generated by the NIH Molecular Libraries Program. As a public resource, PubChem lowers the barrier for researchers to advance the development of chemical tools for modulating biological processes and drug candidates for disease treatments. Published by Elsevier Ltd.

Drug Discov Today. 2010:15(23-24) | 153 Citations (from Europe PMC, 2024-09-07)
PubChem: a public information system for analyzing bioactivities of small molecules. [PMID: 19498078]
Wang Y, Xiao J, Suzek TO, Zhang J, Wang J, Bryant SH.

PubChem is a public repository for biological properties of small molecules hosted by the US National Institutes of Health (NIH). PubChem BioAssay database currently contains biological test results for more than 700 000 compounds. The goal of PubChem is to make this information easily accessible to biomedical researchers. In this work, we present a set of web servers to facilitate and optimize the utility of biological activity information within PubChem. These web-based services provide tools for rapid data retrieval, integration and comparison of biological screening results, exploratory structure-activity analysis, and target selectivity examination. This article reviews these bioactivity analysis tools and discusses their uses. Most of the tools described in this work can be directly accessed online. URLs for accessing other tools described in this work are specified individually.

Nucleic Acids Res. 2009:37(Web Server issue) | 596 Citations (from Europe PMC, 2024-09-07)


All databases: 34/6264 (99.473%)
Health and medicine: 9/1496 (99.465%)
7/871 (99.311%)
Total Rank

Community reviews

Not Rated
Data quality & quantity:
Content organization & presentation:
System accessibility & reliability:

Record metadata

Created on: 2018-01-03
Curated by:
Yuanyuan Cheng [2023-08-22]
Qianpeng Li [2022-04-24]
Sicheng Luo [2022-04-23]
Jing Wei [2022-04-18]
Shoaib Saleem [2019-10-28]
Dong Zou [2019-01-04]
Dong Zou [2019-01-03]
Lina Ma [2018-08-26]
Lina Ma [2018-06-05]
Shixiang Sun [2017-02-20]
Shixiang Sun [2016-03-28]
Mengwei Li [2016-02-13]
Lin Liu [2016-02-08]
Lin Liu [2016-02-02]
Lin Liu [2016-01-17]