TransDFL: A web server for identifying disordered flexible linkers in proteins

Introduction

Disordered flexible linkers (DFLs) are functional disordered regions in proteins. They are sub-regions of intrinsically disordered regions (IDRs) and play important roles in connecting domains and maintaining inter-domain interactions. Because only a limited number of DFLs are available for training, existing machine-learning-based DFL predictors tend to predict ordered residues as DFLs, leading to a high false-positive rate (FPR) and low prediction accuracy. Previous studies have shown that DFLs are extremely flexible disordered regions, which an IDR predictor usually predicts as disordered with high confidence [P(D) > 0.9]. Therefore, transferring an IDR predictor into an accurate DFL predictor is of great significance for understanding the functions of IDRs. In this study, we propose a new predictor called TransDFL, which identifies DFLs by transferring the RFPR-IDP predictor for IDR identification to DFL prediction. RFPR-IDP was first pre-trained with IDR sequences to learn the general features shared by IDRs and DFLs, which helps reduce false positives in ordered regions. It was then fine-tuned with DFL sequences to capture the specific features of DFLs, yielding TransDFL. Experimental results in two application scenarios (prediction of DFLs only within IDRs, or prediction of DFLs in entire proteins) showed that TransDFL consistently outperforms the other existing DFL predictors, achieving higher accuracy. The TransDFL web server is freely accessible at http://bliulab.net/TransDFL/.
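
The observation that DFLs tend to fall in residues scored as disordered with high confidence [P(D) > 0.9] can be illustrated with a minimal Python sketch. This is not the TransDFL model or the RFPR-IDP interface. The function name `high_confidence_segments`, the `min_len` parameter, and the example scores are hypothetical and introduced here only for illustration. The sketch assumes per-residue disorder probabilities are already available as a list of floats.

```python
from typing import List, Tuple


def high_confidence_segments(
    p_disorder: List[float],
    threshold: float = 0.9,
    min_len: int = 1,
) -> List[Tuple[int, int]]:
    """Return 0-based, end-exclusive (start, end) runs where P(D) > threshold.

    Hypothetical helper for illustration; not part of TransDFL or RFPR-IDP.
    """
    segments: List[Tuple[int, int]] = []
    start = None  # start index of the run currently being extended
    for i, p in enumerate(p_disorder):
        if p > threshold:
            if start is None:
                start = i
        elif start is not None:
            # The run ended at position i - 1; keep it if it is long enough.
            if i - start >= min_len:
                segments.append((start, i))
            start = None
    # Close a run that extends to the end of the sequence.
    if start is not None and len(p_disorder) - start >= min_len:
        segments.append((start, len(p_disorder)))
    return segments


# Made-up per-residue scores, for illustration only.
scores = [0.12, 0.35, 0.93, 0.97, 0.95, 0.60, 0.91, 0.20]
print(high_confidence_segments(scores))             # [(2, 5), (6, 7)]
print(high_confidence_segments(scores, min_len=2))  # [(2, 5)]
```

Such a threshold only narrows down candidate regions. According to the description above, TransDFL itself relies on fine-tuning with DFL sequences to separate DFLs from other disordered residues.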

 

We acknowledge with thanks the following software used in this server:

  • RFPR-IDP:
    • Liu Y, Wang X, Liu B. RFPR-IDP: reduce the false positive rates for intrinsically disordered protein and region prediction by incorporating both fully ordered proteins and disordered proteins. Brief Bioinform 2021;22:2000-11.
  • PSI-BLAST: downloaded from https://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/2.6.0/.
  • SPIDER:
    • Yang Y, Heffernan R, Paliwal K, Lyons J, Dehzangi A, Sharma A, et al. SPIDER2: A Package to Predict Secondary Structure, Accessible Surface Area, and Main-Chain Torsional Angles by Deep Neural Networks. Methods Mol Biol 2017;1484:55-63.
    • Heffernan R, Paliwal K, Lyons J, Dehzangi A, Sharma A, Wang J, et al. Improving prediction of secondary structure, local backbone angles, and solvent accessible surface area of proteins by iterative deep learning. Sci Rep 2015;5:11476.
  • PSIPRED:
    • McGuffin LJ, Bryson K, Jones DT. The PSIPRED protein structure prediction server. Bioinformatics 2000;16:404-5.
  • SABLE:
    • Adamczak R, Porollo A, Meller J. Accurate prediction of solvent accessibility using neural networks–based regression. Proteins: Structure, Function, and Bioinformatics 2004;56:753-67.
    • Adamczak R, Porollo A, Meller J. Combining prediction of secondary structure and solvent accessibility in proteins. Proteins: Structure, Function, and Bioinformatics 2005;59:467-75.
    • Wagner M, Adamczak R, Porollo A, Meller J. Linear regression models for solvent accessibility prediction in proteins. J Comput Biol 2005;12:355-69.

Publications

No Publication Information

Credits

  1. Yihe Pang yhpang@bliulab.net
    Developer

    School of Computer Science and Technology, Beijing Institute of Technology, China

  2. Bin Liu bliu@bliulab.net
    Contributor

    School of Computer Science and Technology, Beijing Institute of Technology, China

Summary

Accession: BT007312
Tool Type: Framework
Category: (not specified)
Platforms: (not specified)
Technologies: Python3
User Interface: Webpage
Input Data: FASTA
Latest Release: 0.1.0 (June 4, 2022)
Download Count: 1747
Submitted By: Yihe Pang
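
Since the input format is FASTA, it may help to check a file locally before submitting it. The following is a minimal Python sketch using only the standard library. The helper names `parse_fasta` and `nonstandard_residues` and the example record are hypothetical. Whether the server accepts residues outside the 20 standard amino acids is not stated in this entry, so that check is an assumption.

```python
from typing import Dict, Iterable, List

# The 20 standard amino acid one-letter codes.
STANDARD_AA = set("ACDEFGHIKLMNPQRSTVWY")


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA text lines into a {header: sequence} dictionary."""
    records: Dict[str, str] = {}
    header = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            header = line[1:].strip()
            if not header:
                raise ValueError("empty FASTA header")
            if header in records:
                raise ValueError(f"duplicate header: {header}")
            records[header] = ""
        else:
            if header is None:
                raise ValueError("sequence data before first '>' header")
            # Sequences may span several lines; join them in upper case.
            records[header] += line.upper()
    return records


def nonstandard_residues(seq: str) -> List[str]:
    """Return the sorted letters in seq outside the 20 standard residues."""
    return sorted(set(seq) - STANDARD_AA)


# Made-up record, for illustration only.
example = """>demo_protein hypothetical example
MKTAYIAKQR
QISFVKSHFS
""".splitlines()

records = parse_fasta(example)
for name, seq in records.items():
    print(name, len(seq), nonstandard_residues(seq))
```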