ConoGPT: Fine-Tuning a Protein Language Model by Incorporating Disulfide Bond Information for Conotoxin Sequence Generation.

Guohui Zhao, Cheng Ge, Wenzheng Han, Rilei Yu, Hao Liu
Author Information
  1. Guohui Zhao: College of Computer Science and Technology, Ocean University of China, Songling Road, Qingdao 266100, China.
  2. Cheng Ge: School of Medicine and Pharmacy, Ocean University of China, Songling Road, Qingdao 266100, China.
  3. Wenzheng Han: College of Computer Science and Technology, Ocean University of China, Songling Road, Qingdao 266100, China.
  4. Rilei Yu: School of Medicine and Pharmacy, Ocean University of China, Songling Road, Qingdao 266100, China.
  5. Hao Liu: College of Computer Science and Technology, Ocean University of China, Songling Road, Qingdao 266100, China.

Abstract

Conotoxins are peptide toxins secreted by marine mollusks of the Conus genus. Their unique mechanisms of action and significant biological activity make them highly valuable for drug development. However, traditional methods of obtaining conotoxins, such as in vivo extraction and chemical synthesis, are costly and slow and explore only a limited portion of sequence diversity. To address these issues, we propose ConoGPT, a conotoxin sequence generation model that fine-tunes ProtGPT2 by incorporating disulfide bond information. Experimental results demonstrate that sequences generated by ConoGPT are highly consistent with authentic conotoxins in their physicochemical properties and show considerable potential for generating novel conotoxins. Furthermore, ConoGPT outperforms models without disulfide bond information at generating sequences with ordered structures. Molecular docking showed that the majority of the filtered sequences possess significant binding affinities to nicotinic acetylcholine receptor (nAChR) targets. Molecular dynamics simulations of selected sequences further confirmed the dynamic stability of the generated sequences in complex with their respective targets. This study not only provides a new technological approach for conotoxin design but also offers a novel strategy for generating functional peptides.
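
The abstract does not say how disulfide bond information is encoded or how generated sequences are filtered. As a rough illustration only, and not ConoGPT's actual method, the sketch below shows one simple sequence-level check that is often relevant to disulfide-rich peptides: summarizing the cysteine pattern of a sequence and testing whether its cysteines could all be paired. The function names, the `min_bonds` threshold, and the example sequence are assumptions chosen for illustration.

```python
import re


def cysteine_framework(seq: str) -> str:
    """Summarize the cysteine pattern of a peptide.

    Runs of adjacent cysteines are kept together and separated by '-',
    so a sequence with a Cys-Cys pair followed by two isolated
    cysteines yields 'CC-C-C'. Returns '' if there are no cysteines.
    """
    runs = re.findall(r"C+", seq.upper())
    return "-".join(runs)


def max_disulfide_bonds(seq: str) -> int:
    """Upper bound on disulfide bonds: each bond uses two cysteines."""
    return seq.upper().count("C") // 2


def could_be_fully_paired(seq: str, min_bonds: int = 2) -> bool:
    """Hypothetical filter: even cysteine count and at least `min_bonds` pairs.

    This is a necessary condition only; it says nothing about which
    cysteines actually pair or whether the peptide folds.
    """
    n_cys = seq.upper().count("C")
    return n_cys % 2 == 0 and n_cys // 2 >= min_bonds


if __name__ == "__main__":
    example = "GCCSDPRCAWRC"  # illustrative 12-residue sequence (assumption)
    print(cysteine_framework(example))
    print(max_disulfide_bonds(example))
    print(could_be_fully_paired(example))
```

A check like this only screens for a plausible cysteine count; assessing real disulfide connectivity, ordered structure, or target binding requires the structure prediction, docking, and molecular dynamics steps the abstract describes.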

Grants

  1. 2023CXPT020/Key R&D Program of Shandong Province, China

MeSH Terms

Conotoxins
Disulfides
Molecular Dynamics Simulation
Amino Acid Sequence
Receptors, Nicotinic
Animals
Molecular Docking Simulation
Conus Snail

Chemicals

Conotoxins
Disulfides
Receptors, Nicotinic
