Cross-attention PHV: Prediction of human and virus protein-protein interactions using cross-attention-based neural networks.

Sho Tsukiyama, Hiroyuki Kurata
Author Information
  1. Sho Tsukiyama: Department of Bioscience and Bioinformatics, Kyushu Institute of Technology, 680-4 Kawazu, Iizuka, Fukuoka 820-8502, Japan.
  2. Hiroyuki Kurata: Department of Bioscience and Bioinformatics, Kyushu Institute of Technology, 680-4 Kawazu, Iizuka, Fukuoka 820-8502, Japan.

Abstract

Viral infections represent a major health concern worldwide. The alarming rate at which SARS-CoV-2 spreads, for example, led to a worldwide pandemic. Viruses incorporate genetic material into the host genome to hijack host cell functions such as the cell cycle and apoptosis. Protein-protein interactions (PPIs) play critical roles in these viral processes. Identifying PPIs between humans and viruses is therefore crucial for understanding the infection mechanism and host immune responses to viral infections, and for discovering effective drugs. Experimental methods, including mass spectrometry-based proteomics and yeast two-hybrid assays, are widely used to identify human-virus PPIs, but they are time-consuming, expensive, and laborious. To overcome this problem, we developed a novel computational predictor, named cross-attention PHV, which combines two key technologies: the cross-attention mechanism and a one-dimensional convolutional neural network (1D-CNN). The cross-attention mechanism was very effective in enhancing prediction and generalization abilities. Applying the 1D-CNN to the word2vec-generated feature matrices reduced computational costs, thus extending the allowable length of protein sequences to 9000 amino acid residues. Cross-attention PHV outperformed existing state-of-the-art models on a benchmark dataset and accurately predicted PPIs for unknown viruses. Cross-attention PHV also predicted human-SARS-CoV-2 PPIs with area under the curve (AUC) values >0.95. The Cross-attention PHV web server and source code are freely available at https://kurata35.bio.kyutech.ac.jp/Cross-attention_PHV/ and https://github.com/kuratahiroyuki/Cross-Attention_PHV, respectively.
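
As a rough illustration of the cross-attention idea named in the abstract, the sketch below lets per-position features of a human protein attend over per-position features of a viral protein using generic scaled dot-product attention. It is not the authors' architecture: the function names, the dimensions, and the random matrices standing in for word2vec/1D-CNN feature maps and learned projection weights are all assumptions made for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(queries, context, w_q, w_k, w_v):
    """Generic scaled dot-product cross-attention (illustrative only).

    queries: (n_q, d_model) features of one protein (here: human).
    context: (n_c, d_model) features of the other protein (here: virus).
    w_q, w_k, w_v: (d_model, d_head) projection matrices (hypothetical weights).
    """
    q = queries @ w_q                      # (n_q, d_head)
    k = context @ w_k                      # (n_c, d_head)
    v = context @ w_v                      # (n_c, d_head)
    scores = q @ k.T / np.sqrt(q.shape[-1])  # (n_q, n_c) similarity scores
    weights = softmax(scores, axis=-1)     # each query row sums to 1
    return weights @ v, weights

# Random stand-ins for feature maps; sizes are arbitrary assumptions.
rng = np.random.default_rng(0)
d_model, d_head = 16, 8
human = rng.normal(size=(50, d_model))     # 50 positions of a human protein
virus = rng.normal(size=(30, d_model))     # 30 positions of a viral protein
w_q, w_k, w_v = (rng.normal(size=(d_model, d_head)) for _ in range(3))

human_ctx, attn = cross_attention(human, virus, w_q, w_k, w_v)
```

Swapping the roles of `human` and `virus` gives the attention in the opposite direction; a model of this kind would typically pool both outputs before a final classifier, but how the published model does this is not stated in the abstract.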
