LABAMPsGCN: A framework for identifying lactic acid bacteria antimicrobial peptides based on graph convolutional neural network.

Tong-Jie Sun, He-Long Bu, Xin Yan, Zhi-Hong Sun, Mu-Su Zha, Gai-Fang Dong
Author Information
  1. Tong-Jie Sun: College of Computer and Information Engineering, Inner Mongolia Agricultural University, Hohhot, China.
  2. He-Long Bu: College of Computer and Information Engineering, Inner Mongolia Agricultural University, Hohhot, China.
  3. Xin Yan: College of Computer and Information Engineering, Inner Mongolia Agricultural University, Hohhot, China.
  4. Zhi-Hong Sun: College of Food Science and Engineering, Inner Mongolia Agricultural University, Hohhot, China.
  5. Mu-Su Zha: College of Food Science and Engineering, Inner Mongolia Agricultural University, Hohhot, China.
  6. Gai-Fang Dong: College of Computer and Information Engineering, Inner Mongolia Agricultural University, Hohhot, China.

Abstract

Lactic acid bacteria antimicrobial peptides (LABAMPs) are a class of active polypeptides produced during the metabolism of lactic acid bacteria that can inhibit or kill pathogenic or spoilage bacteria in food. LABAMPs have broad applications in practical fields closely tied to human life, such as food production and efficient agricultural planting. However, screening for antimicrobial peptides through biological experiments is time-consuming and laborious, so a computational model for predicting LABAMPs is urgently needed. In this work, we design a graph convolutional neural network framework for identifying LABAMPs. We build a heterogeneous graph based on amino acids, tripeptides, and their relationships, and learn the weights of a graph convolutional network (GCN). The GCN iteratively learns the word embeddings and sequence weights in the graph under the supervision of the input sequence labels. In 10-fold cross-validation on two training datasets, the framework achieved accuracies of 0.9163 and 0.9379, respectively, which are higher than those of other machine learning and GNN algorithms. On the independent test sets, the accuracies for the two datasets are 0.9130 and 0.9291, which are 1.08% and 1.57% higher than those of the best methods among other online web servers.
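
The graph construction and GCN propagation described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The abstract does not state the edge weighting, feature matrix, or layer sizes, so the count-based sequence-tripeptide edges, the adjacency edges between overlapping tripeptides, the one-hot node features, the hidden size of 8, and the toy sequences below are all assumptions chosen for illustration. Training is omitted; only an untrained forward pass is shown.

```python
import numpy as np

def tripeptides(seq):
    """Return the overlapping 3-residue windows ("words") of a peptide."""
    return [seq[i:i + 3] for i in range(len(seq) - 2)]

def build_graph(seqs):
    """Build a symmetric adjacency matrix over sequence and tripeptide nodes.

    Nodes 0..len(seqs)-1 are sequences; the remaining nodes are tripeptides.
    """
    vocab = sorted({t for s in seqs for t in tripeptides(s)})
    index = {t: len(seqs) + j for j, t in enumerate(vocab)}
    n = len(seqs) + len(vocab)
    adj = np.zeros((n, n))
    for i, s in enumerate(seqs):
        toks = tripeptides(s)
        for t in toks:
            # Sequence-tripeptide edge weighted by occurrence count (assumption).
            adj[i, index[t]] += 1.0
            adj[index[t], i] += 1.0
        for a, b in zip(toks, toks[1:]):
            # Tripeptide-tripeptide edge for neighbouring windows (assumption).
            if a != b:
                adj[index[a], index[b]] = 1.0
                adj[index[b], index[a]] = 1.0
    return adj, vocab

def normalize(adj):
    """Symmetric normalization D^-1/2 (A + I) D^-1/2 used by standard GCNs."""
    a_tilde = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_tilde.sum(axis=1))
    return a_tilde * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_forward(a_hat, x, w1, w2):
    """Two-layer GCN forward pass: softmax(A_hat · ReLU(A_hat · X · W1) · W2)."""
    h = np.maximum(a_hat @ x @ w1, 0.0)
    logits = a_hat @ h @ w2
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

# Toy input: two made-up peptide strings, not taken from the paper's datasets.
seqs = ["KKLLKK", "GIGAVLK"]
adj, vocab = build_graph(seqs)
a_hat = normalize(adj)

rng = np.random.default_rng(0)
x = np.eye(adj.shape[0])                   # one-hot node features (assumption)
w1 = rng.normal(size=(adj.shape[0], 8))    # hidden size 8 is arbitrary
w2 = rng.normal(size=(8, 2))               # two classes: LABAMP / non-LABAMP
probs = gcn_forward(a_hat, x, w1, w2)      # class probabilities for every node
seq_probs = probs[:len(seqs)]              # rows belonging to sequence nodes
```

In a supervised setting, only the rows of `probs` that correspond to labelled sequence nodes would enter the loss, while the tripeptide nodes receive their embeddings through propagation over the graph.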
