Database Commons

a catalog of worldwide biological databases

Database Profile

General information

Full name: DataBase for automated Carbohydrate-active enzyme ANnotation
Description: The dbCAN3 server is a web server for automated Carbohydrate-active enzyme ANnotation, funded by the NSF (DBI-1933521) and NIH (R01GM140370). Similar resources on the web include CAZy, CAT (obsolete), and CUPP. If users submit faa+gff files or an fna file, the dbCAN3 server can also identify transcription factors (TFs), transporters (TCs), signal transduction proteins (STPs), and CAZyme gene clusters (CGCs) using CGC-Finder. The dbCAN3 server combines the results from its three annotation tools and allows visualization of detailed results as tables/graphs.
Year founded: 2012
Last update: 2019/07/10
Version: v.3
Country/Region: United States

Classification & Tag

Data type:
Data object:
Database category:
Major species:

Contact information

University/Institution: Northern Illinois University
Address: Northern Illinois University, DeKalb, IL, USA
Country/Region: United States
Contact name (PI/Team): Yanbin Yin
Contact email (PI/Helpdesk):


dbCAN3: automated carbohydrate-active enzyme and substrate annotation. [PMID: 37125649]
Jinfang Zheng, Qiwei Ge, Yuchen Yan, Xinpeng Zhang, Le Huang, Yanbin Yin

Carbohydrate active enzymes (CAZymes) are made by various organisms for complex carbohydrate metabolism. Genome mining of CAZymes has become a routine data analysis in (meta-)genome projects, owing to the importance of CAZymes in bioenergy, microbiome, nutrition, agriculture, and global carbon recycling. In 2012, dbCAN was provided as an online web server for automated CAZyme annotation. dbCAN2 was further developed in 2018 as a meta server to combine multiple tools for improved CAZyme annotation. dbCAN2 also included CGC-Finder, a tool for identifying CAZyme gene clusters (CGCs) in (meta-)genomes. We have updated the meta server to dbCAN3 with the following new functions and components: (i) dbCAN-sub as a profile Hidden Markov Model database (HMMdb) for substrate prediction at the CAZyme subfamily level; (ii) searching against experimentally characterized polysaccharide utilization loci (PULs) with known glycan substrates of the dbCAN-PUL database for substrate prediction at the CGC level; (iii) a majority voting method to consider all CAZymes with substrates predicted from dbCAN-sub for substrate prediction at the CGC level; (iv) improved data browsing and visualization of substrate prediction results on the website. In summary, dbCAN3 not only inherits all the functions of dbCAN2, but also integrates three new methods for glycan substrate prediction.

Nucleic Acids Res. 2023:51(W1) | 37 Citations (from Europe PMC, 2024-05-18)
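
The majority voting step named in item (iii) of the abstract above can be illustrated with a short sketch. This is not the dbCAN3 implementation: the function name `vote_cgc_substrate`, the input format (one optional substrate label per CAZyme in a CGC), and the rule that ties yield no prediction are all assumptions made for illustration.

```python
from collections import Counter
from typing import Optional, Sequence


def vote_cgc_substrate(cazyme_substrates: Sequence[Optional[str]]) -> Optional[str]:
    """Return the substrate predicted for the most CAZymes in one CGC.

    Hypothetical helper: each element is the substrate predicted for one
    CAZyme in the cluster, or None when no substrate was predicted.
    CAZymes without a prediction do not vote. A tie for first place
    returns None (an assumption of this sketch, not a documented rule).
    """
    votes = Counter(s for s in cazyme_substrates if s is not None)
    if not votes:
        return None
    ranked = votes.most_common(2)
    if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
        return None  # no single majority substrate
    return ranked[0][0]


if __name__ == "__main__":
    # Two of three voting CAZymes point to xylan.
    print(vote_cgc_substrate(["xylan", "xylan", "pectin", None]))
```

A real pipeline would probably also weigh prediction scores or require a minimum number of voting CAZymes; the sketch only shows the counting idea.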

dbCAN2: a meta server for automated carbohydrate-active enzyme annotation. [PMID: 29771380]
Zhang H, Yohe T, Huang L, Entwistle S, Wu P, Yang Z, Busk PK, Xu Y, Yin Y.

Complex carbohydrates of plants are the main food sources of animals and microbes, and serve as promising renewable feedstock for biofuel and biomaterial production. Carbohydrate active enzymes (CAZymes) are the most important enzymes for complex carbohydrate metabolism. With an increasing number of plant and plant-associated microbial genomes and metagenomes being sequenced, there is an urgent need for automatic tools for genomic data mining of CAZymes. We developed the dbCAN web server in 2012 to provide a public service for automated CAZyme annotation for newly sequenced genomes. Here, dbCAN2 is presented as an updated meta server, which integrates three state-of-the-art tools for CAZome (all CAZymes of a genome) annotation: (i) HMMER search against the dbCAN HMM (hidden Markov model) database; (ii) DIAMOND search against the CAZy pre-annotated CAZyme sequence database and (iii) Hotpep search against the conserved CAZyme short peptide database. Combining the three outputs and removing CAZymes found by only one tool can significantly improve the CAZome annotation accuracy. In addition, dbCAN2 now also accepts nucleotide sequence submission, and offers the service to predict physically linked CAZyme gene clusters (CGCs), which will be a very useful online tool for identifying putative polysaccharide utilization loci (PULs) in microbial genomes or metagenomes.

Nucleic Acids Res. 2018:46(W1) | 880 Citations (from Europe PMC, 2024-05-18)
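
The consensus rule in the abstract above (combine the three outputs and drop CAZymes found by only one tool) can be sketched as a simple support count. The function name `consensus_cazymes`, the dictionary input format, and the protein IDs below are hypothetical; the real server parses each tool's own output files, which this sketch does not attempt.

```python
from collections import Counter
from typing import Dict, Iterable, Set


def consensus_cazymes(hits_by_tool: Dict[str, Iterable[str]],
                      min_tools: int = 2) -> Set[str]:
    """Keep protein IDs reported as CAZymes by at least `min_tools` tools.

    `hits_by_tool` maps a tool name to the protein IDs it called CAZymes
    (an assumed, simplified input format). With three tools and
    min_tools=2, proteins found by only one tool are removed.
    """
    support = Counter()
    for protein_ids in hits_by_tool.values():
        # set() ensures a tool counts once per protein even if it
        # reports several hits for that protein.
        support.update(set(protein_ids))
    return {pid for pid, n in support.items() if n >= min_tools}


if __name__ == "__main__":
    hits = {
        "HMMER": ["p1", "p2", "p3"],
        "DIAMOND": ["p2", "p3", "p4"],
        "Hotpep": ["p3", "p5"],
    }
    print(sorted(consensus_cazymes(hits)))
```

Raising `min_tools` to 3 would keep only proteins supported by every tool, trading sensitivity for precision.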

dbCAN: a web resource for automated carbohydrate-active enzyme annotation. [PMID: 22645317]
Yin Y, Mao X, Yang J, Chen X, Mao F, Xu Y.

Carbohydrate-active enzymes (CAZymes) are very important to the biotech industry, particularly the emerging biofuel industry, because CAZymes are responsible for the synthesis, degradation and modification of all the carbohydrates on Earth. We have developed a web resource, dbCAN, to provide a capability for automated CAZyme signature domain-based annotation for any given protein data set (e.g. proteins from a newly sequenced genome) submitted to our server. To accomplish this, we have explicitly defined a signature domain for every CAZyme family, derived based on the CDD (conserved domain database) search and literature curation. We have also constructed a hidden Markov model to represent the signature domain of each CAZyme family. These CAZyme family-specific HMMs are our key contribution and the foundation for the automated CAZyme annotation.

Nucleic Acids Res. 2012:40(Web Server issue) | 937 Citations (from Europe PMC, 2024-05-18)


All databases:
79/6000 (98.7%)
Gene genome and annotation:
32/1675 (98.149%)
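
The percentages next to each rank are consistent with the formula (N - rank + 1) / N, where N is the number of databases in the group. That formula is inferred from the two figures shown here, not taken from Database Commons documentation, and the function name `rank_percentile` is hypothetical.

```python
def rank_percentile(rank: int, total: int) -> float:
    """Percentage of databases ranked at or below this one.

    Assumed formula: (total - rank + 1) / total * 100, rounded to three
    decimals. It is inferred from the two figures on this page.
    """
    if not 1 <= rank <= total:
        raise ValueError("rank must be between 1 and total")
    return round((total - rank + 1) / total * 100, 3)


if __name__ == "__main__":
    print(rank_percentile(79, 6000))   # all databases
    print(rank_percentile(32, 1675))   # gene genome and annotation
```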

Record metadata

Created on: 2018-01-28
Curated by:
Xinyu Zhou [2023-08-28]
Shoaib Saleem [2019-11-25]
Shoaib Saleem [2019-11-19]
Rabail Raza [2018-12-26]
Pei Wang [2018-03-21]
Pei Wang [2018-03-11]
Pei Wang [2018-02-23]
Hao Zhang [2018-01-28]