Multi-Peptide: Multimodality Leveraged Language-Graph Learning of Peptide Properties.

Srivathsan Badrinarayanan, Chakradhar Guntuboina, Parisa Mollaei, Amir Barati Farimani
Author Information
  1. Srivathsan Badrinarayanan: Department of Chemical Engineering, Carnegie Mellon University, Pittsburgh 15213, Pennsylvania, United States.
  2. Chakradhar Guntuboina: Department of Electrical and Computer Engineering, Carnegie Mellon University, Pittsburgh 15213, Pennsylvania, United States.
  3. Parisa Mollaei: Department of Mechanical Engineering, Carnegie Mellon University, Pittsburgh 15213, Pennsylvania, United States.
  4. Amir Barati Farimani: Department of Chemical Engineering, Carnegie Mellon University, Pittsburgh 15213, Pennsylvania, United States.

Abstract

Peptides are crucial in biological processes and therapeutic applications. Given their importance, advancing our ability to predict peptide properties is essential. In this study, we introduce Multi-Peptide, an innovative approach that combines transformer-based language models with graph neural networks (GNNs) to predict peptide properties. We integrate PeptideBERT, a transformer model specifically designed for peptide property prediction, with a GNN encoder to capture both sequence-based and structural features. By employing a contrastive loss framework, Multi-Peptide aligns embeddings from both modalities into a shared latent space, thereby enhancing the transformer model's predictive accuracy. Evaluations on hemolysis and nonfouling data sets demonstrate Multi-Peptide's robustness, achieving state-of-the-art 88.057% accuracy in hemolysis prediction. This study highlights the potential of multimodal learning in bioinformatics, paving the way for accurate and reliable predictions in peptide-based research and applications.
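The contrastive alignment described in the abstract can be illustrated with a minimal sketch. The abstract does not state the exact form of the loss, so the code below assumes a symmetric, CLIP-style InfoNCE objective over a batch of paired sequence and graph embeddings. The function names, the temperature value, and the embedding sizes are illustrative assumptions and are not taken from the Multi-Peptide implementation.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    # Scale each row to unit length so dot products become cosine similarities.
    return x / np.maximum(np.linalg.norm(x, axis=1, keepdims=True), eps)

def log_softmax(logits):
    # Numerically stable row-wise log-softmax.
    shifted = logits - logits.max(axis=1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))

def contrastive_loss(seq_emb, graph_emb, temperature=0.07):
    """Symmetric InfoNCE loss (assumed form, not confirmed by the source).

    seq_emb   : (batch, dim) embeddings from the sequence (transformer) encoder
    graph_emb : (batch, dim) embeddings from the graph (GNN) encoder
    Row i of each array is assumed to describe the same peptide.
    """
    s = l2_normalize(seq_emb)
    g = l2_normalize(graph_emb)
    logits = s @ g.T / temperature           # pairwise similarity matrix
    idx = np.arange(len(s))                  # matching pairs sit on the diagonal
    loss_s2g = -log_softmax(logits)[idx, idx].mean()    # sequence -> graph
    loss_g2s = -log_softmax(logits.T)[idx, idx].mean()  # graph -> sequence
    return 0.5 * (loss_s2g + loss_g2s)

# Toy usage with random stand-in embeddings (no real peptide data).
rng = np.random.default_rng(0)
seq = rng.normal(size=(8, 16))
graph = seq + 0.01 * rng.normal(size=(8, 16))   # nearly aligned modalities
print(contrastive_loss(seq, graph))
print(contrastive_loss(seq, np.roll(graph, 1, axis=0)))  # mismatched pairs
```

Under these assumptions, the loss is small when each peptide's two embeddings lie close together in the shared space and grows when the pairs are mismatched, which is the behavior a training loop would exploit to pull the two modalities together.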

MeSH Terms

Peptides
Neural Networks, Computer
Hemolysis
Computational Biology
Machine Learning
Humans

Chemicals

Peptides
