NetBCE: An Interpretable Deep Neural Network for Accurate Prediction of Linear B-Cell Epitopes


Identification of B-cell epitopes (BCEs) plays an essential role in the development of peptide vaccines, immuno-diagnostic reagents, and antibody design and production. In this work, we generated a large benchmark dataset comprising 126,779 experimentally supported, linear epitope-containing regions in 3,567 protein clusters from over 1.3 million B-cell assays. Analysis of this curated dataset showed large pathogen diversity, covering 176 different families. The accuracy of linear BCE prediction varied strongly across features, although all sequence and structural features were informative. To search for more efficient and interpretable feature representations, we developed NetBCE, a ten-layer deep learning framework for linear BCE prediction. By automatically learning informative classification features, NetBCE achieved high accuracy and robust performance, with an average area under the curve (AUC) of 0.846 in five-fold cross-validation. NetBCE substantially outperformed conventional machine learning algorithms and other tools, with an improvement of over 22.06% in AUC relative to other tools on an independent dataset. Examination of the outputs of important network modules in NetBCE showed that epitopes and non-epitopes tended to occupy distinct regions, with efficient feature representation along the network layer hierarchy. NetBCE is freely available at
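As a rough illustration of the metric reported above, the following sketch shows how an average AUC over cross-validation folds can be computed. It is not NetBCE code: the function names `auc` and `mean_auc_over_folds` and the toy labels and scores are assumptions made for this example, and each fold is assumed to already hold held-out predictions from a model trained on the other folds.

```python
def auc(labels, scores):
    """Rank-based AUC: the probability that a randomly chosen positive
    scores higher than a randomly chosen negative (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def mean_auc_over_folds(folds):
    """Average the per-fold AUC values.

    `folds` is a list of (labels, scores) pairs, one per held-out fold.
    """
    per_fold = [auc(labels, scores) for labels, scores in folds]
    return sum(per_fold) / len(per_fold)


# Toy data only (hypothetical): 1 = epitope, 0 = non-epitope.
toy_folds = [
    ([1, 0, 1, 0], [0.9, 0.2, 0.7, 0.4]),
    ([1, 0, 1, 0], [0.8, 0.8, 0.3, 0.1]),
]
print(mean_auc_over_folds(toy_folds))
```

In five-fold cross-validation, `folds` would contain five such pairs, and the reported figure is the mean of the five per-fold AUC values.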


No Publication Information


  1. Haodong Xu

    UTHealth School of Biomedical Informatics, UT Health Science Center at Houston, United States of America

Tool Type: Framework
Category: B-cell epitopes
Platforms: Linux/Unix, Windows
User Interface: Terminal Command Line
Input Data: FASTA
Latest Release: 1.0 (August 17, 2022)
Download Count: 1438
Country/Region: United States of America
Submitted By: Haodong Xu

This study was partially supported by National Institutes of Health grants (R01LM012806, R01DE030122, and R01DE029818). We acknowledge resource support from the Cancer Prevention and Research Institute of Texas (CPRIT RP180734 and RP210045). Funding for open access charge: CPRIT (RP180734).