NetBCE: An Interpretable Deep Neural Network for Accurate Prediction of Linear B-Cell Epitopes
Introduction
Identification of B-cell epitopes (BCEs) plays an essential role in the development of peptide vaccines, immuno-diagnostic reagents, and antibody design and production. In this work, we generated a large benchmark dataset comprising 126,779 experimentally supported, linear epitope-containing regions in 3567 protein clusters from over 1.3 million B-cell assays. Analysis of this curated dataset showed large pathogen diversity, covering 176 different families. Prediction accuracy for linear BCEs varied strongly across feature types, although all sequence and structural features were informative. To search for more efficient and interpretable feature representations, we developed NetBCE, a ten-layer deep learning framework for linear BCE prediction. By automatically learning informative classification features, NetBCE achieved high accuracy and robust performance, with an average area under the curve (AUC) of 0.846 in five-fold cross-validation. NetBCE substantially outperformed conventional machine learning algorithms and other tools, improving the AUC by more than 22.06% over other tools on an independent dataset. Inspection of the outputs of important network modules in NetBCE showed that epitopes and non-epitopes tended to occupy distinct regions, with efficient feature representation along the network layer hierarchy. NetBCE is freely available at https://github.com/bsml320/NetBCE.
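To make the reported metric concrete, here is a minimal sketch of how an average AUC over five-fold cross-validation can be computed. It is not NetBCE's code: the function names (`auc`, `stratified_folds`, `cross_validated_auc`, `fit_toy`), the toy peptides, and the residue-frequency scorer are all hypothetical stand-ins, used only so the example runs with the Python standard library.

```python
import random
from collections import Counter


def auc(labels, scores):
    """Area under the ROC curve, computed as the fraction of
    (positive, negative) pairs ranked correctly; ties count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def stratified_folds(labels, k=5, seed=0):
    """Split sample indices into k folds, spreading each class evenly."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in (0, 1):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for j, i in enumerate(idx):
            folds[j % k].append(i)
    return folds


def cross_validated_auc(samples, labels, fit, k=5, seed=0):
    """Train on k-1 folds, score the held-out fold, and average the AUCs."""
    fold_aucs = []
    for held_out in stratified_folds(labels, k, seed):
        held = set(held_out)
        train_x = [x for i, x in enumerate(samples) if i not in held]
        train_y = [y for i, y in enumerate(labels) if i not in held]
        score = fit(train_x, train_y)  # fit returns a scoring function
        fold_aucs.append(auc([labels[i] for i in held_out],
                             [score(samples[i]) for i in held_out]))
    return sum(fold_aucs) / k, fold_aucs


def fit_toy(train_x, train_y):
    """Hypothetical stand-in model (NOT NetBCE): score a peptide by the
    fraction of its residues that are enriched in training epitopes."""
    pos, neg = Counter(), Counter()
    for pep, y in zip(train_x, train_y):
        (pos if y == 1 else neg).update(pep)
    pos_total = sum(pos.values()) or 1
    neg_total = sum(neg.values()) or 1
    enriched = {aa for aa in pos if pos[aa] / pos_total > neg[aa] / neg_total}
    return lambda pep: sum(aa in enriched for aa in pep) / len(pep)


# Made-up toy peptides: 1 = epitope, 0 = non-epitope.
PEPTIDES = ["KDEPNK", "DKEENP", "NPKDEK", "EKDNPD", "PNEKDK",
            "LVAIFL", "AILVFA", "FLAVIL", "VIFALV", "ILFVAA"]
LABELS = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]

mean_auc, per_fold = cross_validated_auc(PEPTIDES, LABELS, fit_toy)
print(f"mean AUC over {len(per_fold)} folds: {mean_auc:.3f}")
```

The pairwise definition of AUC used here is quadratic in the number of samples, which is fine for a toy example; on a dataset the size of the one described above, a rank-based implementation from an established library would be the more practical choice.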
Publications
No Publication Information
Credits
- Haodong Xu xuhaodong1992@qq.com Investigator
UTHealth School of Biomedical Informatics, UT Health Science Center at Houston, United States of America
Accession | BT007321 |
---|---|
Tool Type | Framework |
Category | B-cell epitopes |
Platforms | Linux/Unix, Windows |
Technologies | Python3 |
User Interface | Terminal Command Line |
Input Data | FASTA |
Latest Release | 1.0 (August 17, 2022) |
Download Count | 1462 |
Country/Region | United States of America |
Submitted By | Haodong Xu |
This study was partially supported by National Institutes of Health grants (R01LM012806, R01DE030122, and R01DE029818). We acknowledge resource support from the Cancer Prevention and Research Institute of Texas (CPRIT RP180734 and RP210045). Funding for open access charge: CPRIT (RP180734).