Database Commons

a catalog of worldwide biological databases

Database Profile

LigCys3D

General information

URL: https://ligcys.computchem.org
Full name: Ligandable Cysteines in Three-Dimensional Structures
Description: LigCys3D is a curated database of over 1,000 covalently liganded cysteines in nearly 800 proteins, represented by over 10,000 three-dimensional structures from the Protein Data Bank. It aims to support the discovery and development of covalent drugs by providing protein structure information and ligand binding site data.
Year founded: 2024
Last update: 2024-04-05
Version: v1.0
Accessibility: Accessible
Country/Region: United States

Classification & Tag

Data type:
Data object:
Database category:
Major species:
Keywords:

Contact information

University/Institution: University of Maryland School of Pharmacy
Address:
City:
Province/State:
Country/Region: United States
Contact name (PI/Team): Jana Shen
Contact email (PI/Helpdesk): ana.shen@rx.umaryland.edu

Publications

Machine Learning Models to Interrogate Proteome-Wide Covalent Ligandabilities Directed at Cysteines. [PMID: 38665640]
Ruibin Liu, Joseph Clayton, Mingzhe Shen, Shubham Bhatnagar, Jana Shen

Machine learning (ML) identification of covalently ligandable sites may accelerate targeted covalent inhibitor design and help expand the druggable proteome space. Here, we report the rigorous development and validation of the tree-based models and convolutional neural networks (CNNs) trained on a newly curated database (LigCys3D) of over 1000 liganded cysteines in nearly 800 proteins represented by over 10,000 three-dimensional structures in the Protein Data Bank (PDB). The unseen tests yielded areas under the receiver operating characteristic curve of 94% and 93% for the tree models and CNNs, respectively. Based on the AlphaFold2 predicted structures, the ML models recapitulated the newly liganded cysteines in the PDB with over 90% recall values. To assist the community of covalent drug discoveries, we report the predicted ligandable cysteines in 392 human kinases and their locations in the sequence-aligned kinase structure, including the PH and SH2 domains. Furthermore, we disseminate a searchable online database LigCys3D (https://ligcys.computchem.org/) and a web prediction server DeepCys (https://deepcys.computchem.org/), both of which will be continuously updated and improved by including newly published experimental data. The present work represents an important step toward the ML-led integration of big genome data and structure models to annotate the human proteome space for the next-generation covalent drug discoveries.

JACS Au. 2024:4(4) | 13 Citations (from Europe PMC, 2026-06-13)

Ranking

All databases: 2228/6932 (67.874%)
Structure: 315/972 (67.695%)
Health and medicine: 547/1756 (68.907%)
Total Rank: 2228
Citations: 10
z-index: 5
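
The percentages shown next to each rank are not explained on the page. They appear consistent with the share of databases ranked at or below this entry, that is (total - rank + 1) / total, rounded to three decimals. The sketch below checks that reading against the three ranking lines; the formula and the function name `rank_percentile` are assumptions inferred from the numbers, not a documented Database Commons method.

```python
def rank_percentile(rank: int, total: int) -> float:
    """Percentage of entries ranked at or below `rank` (rank 1 is best).

    Assumed formula, inferred from the figures on this page:
    100 * (total - rank + 1) / total, rounded to 3 decimals.
    """
    if not 1 <= rank <= total:
        raise ValueError("rank must be between 1 and total")
    return round(100 * (total - rank + 1) / total, 3)


# (rank, total) pairs copied from the Ranking section of this record.
rankings = {
    "All databases": (2228, 6932),
    "Structure": (315, 972),
    "Health and medicine": (547, 1756),
}

for category, (rank, total) in rankings.items():
    print(f"{category}: {rank}/{total} ({rank_percentile(rank, total)}%)")
```

Under this assumption, the computed values match the listed 67.874%, 67.695%, and 68.907%.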

Community reviews

Not Rated
Data quality & quantity:
Content organization & presentation:
System accessibility & reliability:

Record metadata

Created on: 2024-07-15
Curated by:
Wenzhuo Cheng [2024-08-23]
Shiting Wang [2024-08-21]
Shiting Wang [2024-07-23]
Shaosen Zhang [2024-07-16]
Shiting Wang [2024-07-15]