Database Commons

a catalog of worldwide biological databases

Database Profile

ToCoDDB

General information

URL: http://cadd.zju.edu.cn/tocodecoy/
Full name: Topology-Based and Conformation-Based Decoys Database
Description: ToCoDDB is an unbiased database for the training and benchmarking of machine-learning scoring functions, providing not only 155 target-specific datasets but also a decoys generation interface.
Year founded: 2023
Last update:
Version: v1.0
Accessibility: Accessible
Country/Region: China

Classification & Tag

Data type:
Data object: NA
Database category:
Major species: NA
Keywords:

Contact information

University/Institution: Zhejiang University
Address: Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China.
City: Hangzhou
Province/State: Zhejiang
Country/Region: China
Contact name (PI/Team): Zhe Wang
Contact email (PI/Helpdesk): wangzhehyd@zju.edu.cn

Publications

Topology-Based and Conformation-Based Decoys Database: An Unbiased Online Database for Training and Benchmarking Machine-Learning Scoring Functions. [PMID: 37317043]
Xujun Zhang, Chao Shen, Tianyue Wang, Yu Kang, Dan Li, Peichen Pan, Jike Wang, Gaoang Wang, Yafeng Deng, Lei Xu, Dongsheng Cao, Tingjun Hou, Zhe Wang

Machine-learning-based scoring functions (MLSFs) have gained attention for their potential to improve accuracy in binding affinity prediction and structure-based virtual screening (SBVS) compared to classical SFs. Developing accurate MLSFs for SBVS requires a large and unbiased dataset that includes structurally diverse actives and decoys. Unfortunately, most datasets suffer from hidden biases and data insufficiency. Here, we developed the Topology-Based and Conformation-Based Decoys Database (ToCoDDB). The biological targets and active ligands in ToCoDDB were collected from scientific literature and established datasets. The decoys were generated and debiased by using conditional recurrent neural networks and molecular docking. ToCoDDB is presently the largest unbiased database, with 2.4 million decoys encompassing 155 targets. Detailed information and a performance benchmark are provided for each target, which are beneficial for training and evaluating MLSFs. Moreover, the online decoy generation function of ToCoDDB further expands its application range to any target. ToCoDDB is freely available at http://cadd.zju.edu.cn/tocodecoy/.

J Med Chem. 2023:66(13) | 1 citation (from Europe PMC, 2025-12-13)

Ranking

All databases: 6137/6895 (11.008%)
Structure: 858/967 (11.375%)
Interaction: 1083/1194 (9.38%)
Total rank: 6137
Citations: 1
z-index: 0.5
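
The page does not say how the percentages beside each rank are computed. As a minimal sketch, assuming each one is the share of databases ranked at or below this entry, that is (total - rank + 1) / total x 100 rounded to three decimals, the Python below reproduces the three listed values. The function name rank_percentile is hypothetical, and the formula is inferred from the numbers rather than documented by Database Commons.

```python
def rank_percentile(rank: int, total: int) -> float:
    """Share (in %) of entries ranked at or below `rank`.

    Assumed formula, inferred from the figures on this page;
    it is not a documented Database Commons definition.
    """
    if not 1 <= rank <= total:
        raise ValueError("rank must lie between 1 and total")
    return round((total - rank + 1) / total * 100, 3)


# Rank/total pairs copied from the Ranking section of this record.
rankings = {
    "All databases": (6137, 6895),
    "Structure": (858, 967),
    "Interaction": (1083, 1194),
}

for category, (rank, total) in rankings.items():
    print(f"{category}: {rank}/{total} ({rank_percentile(rank, total)}%)")
```

If the assumption holds, a smaller percentage means fewer databases sit at or below this entry in that category.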

Community reviews

Not Rated
Data quality & quantity:
Content organization & presentation:
System accessibility & reliability:


Record metadata

Created on: 2023-08-23
Curated by:
shaosen zhang [2024-08-22]
Yuxin Qin [2023-09-27]
Xinyu Zhou [2023-09-11]
Yuxin Qin [2023-08-23]