circRNA-binding protein site prediction based on multi-view deep learning, subspace learning and multi-view classifier.

Hui Li, Zhaohong Deng, Haitao Yang, Xiaoyong Pan, Zhisheng Wei, Hong-Bin Shen, Kup-Sze Choi, Lei Wang, Shitong Wang, Jing Wu
Author Information
  1. Hui Li: Jiangnan University, Wuxi, Jiangsu 214012, China.
  2. Zhaohong Deng: School of Artificial Intelligence and Computer Science of Jiangnan University, Key Laboratory of Computational Neuroscience and Brain-Inspired Intelligence (LCNBI) and ZJLab, Wuxi, Jiangsu 214012, China.
  3. Haitao Yang: Jiangnan University, Wuxi, Jiangsu 214012, China.
  4. Xiaoyong Pan: Department of Automation of Shanghai Jiao Tong University, Wuxi, Jiangsu 214012, China.
  5. Zhisheng Wei: School of Biotechnology and Key Laboratory of Industrial Biotechnology Ministry in Jiangnan University, Wuxi, Jiangsu 214012, China.
  6. Hong-Bin Shen: Shanghai Jiao Tong University, Wuxi, Jiangsu 214012, China.
  7. Kup-Sze Choi: Hong Kong Polytechnic University, Wuxi, Jiangsu 214012, China.
  8. Lei Wang: School of Biotechnology and Key Laboratory of Industrial Biotechnology Ministry in Jiangnan University, Wuxi, Jiangsu 214012, China.
  9. Shitong Wang: School of Artificial Intelligence and Computer Science of Jiangnan University, Wuxi, Jiangsu 214012, China.
  10. Jing Wu: School of Biotechnology and Key Laboratory of Industrial Biotechnology Ministry in Jiangnan University, Wuxi, Jiangsu 214012, China.

Abstract

Circular RNAs (circRNAs) generally bind to RNA-binding proteins (RBPs) and thereby play an important role in the regulation of autoimmune diseases, so it is crucial to study the binding sites of RBPs on circRNAs. Although many methods, including traditional machine learning and deep learning approaches, have been developed to predict interactions between RNAs and RBPs, most of them focus on linear RNAs. Few studies have addressed the binding relationships between circRNAs and RBPs, and in-depth research is urgently needed. Existing circRNA-RBP binding site prediction methods take circRNA sequences as the main research subject but do not fully exploit relevant characteristics of circRNAs, such as the structure and composition information of the sequences. Some methods extract different views to construct recognition models, but how to use multi-view data efficiently for this purpose is still not well studied. To address these problems, this paper proposes DMSK, a multi-view classification method based on multi-view deep learning, subspace learning and a multi-view classifier, for the identification of circRNA-RBP interaction sites. In DMSK, we first converted circRNA sequences into pseudo-amino acid sequences and pseudo-dipeptide components to extract high-dimensional sequence features and component features of circRNAs, respectively. Then, the structure prediction method RNAfold was used to predict the secondary structure of the RNA sequences, and a sequence embedding model was used to extract context-dependent features. Next, we fed the raw features of these four views to a hybrid network, composed of a convolutional neural network and a long short-term memory network, to obtain deep features of the circRNAs. Furthermore, we used view-weighted generalized canonical correlation analysis to extract the common features of the four views by subspace learning. Finally, the learned subspace common features and the multi-view deep features were used to train a downstream multi-view TSK fuzzy system, yielding a multi-view classifier based on fuzzy rules and fuzzy inference. The trained classifier was used to predict the specific positions of RBP binding sites on circRNAs. Experiments show that DMSK improves prediction performance compared with existing methods. The code and dataset of this study are available at https://github.com/Rebecca3150/DMSK.
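
The abstract does not spell out how circRNA sequences are converted into pseudo-amino acid sequences and pseudo-dipeptide components. The following is a minimal sketch of one plausible reading, not the authors' implementation: each nucleotide triplet is mapped to an amino acid symbol through the standard genetic code, and adjacent symbol pairs are then counted. The function names, the sliding-window `stride` parameter and the "X" placeholder for unrecognized triplets are all assumptions made for illustration.

```python
from collections import Counter
from itertools import product

# Standard genetic code with codons enumerated in U, C, A, G order.
# '*' marks stop codons.
BASES = "UCAG"
AMINO = ("FFLLSSSSYY**CC*W"
         "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR"
         "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def to_pseudo_amino_acids(rna, stride=1):
    """Map each 3-nucleotide window to an amino acid symbol.

    stride=1 slides the window one base at a time (an assumption);
    stride=3 reads non-overlapping triplets. Triplets containing
    symbols outside A/C/G/U are mapped to 'X' (hypothetical choice).
    """
    rna = rna.upper().replace("T", "U")
    return "".join(
        CODON_TABLE.get(rna[i:i + 3], "X")
        for i in range(0, len(rna) - 2, stride)
    )


def dipeptide_composition(pseudo_aa):
    """Return the relative frequency of each adjacent symbol pair."""
    pairs = Counter(pseudo_aa[i:i + 2] for i in range(len(pseudo_aa) - 1))
    total = sum(pairs.values())
    return {p: n / total for p, n in pairs.items()} if total else {}


if __name__ == "__main__":
    seq = "AUGGCC"
    aa = to_pseudo_amino_acids(seq)
    print(aa)
    print(dipeptide_composition(aa))
```

In a pipeline like the one described, the pseudo-amino acid string would supply the sequence view and the pair frequencies the component view. The encoding actually used by DMSK should be checked against the linked repository.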

MeSH Term

Binding Sites
Carrier Proteins
Computational Biology
Deep Learning
Humans
RNA, Circular

Chemicals

Carrier Proteins
RNA, Circular
