RNADiffFold: generative RNA secondary structure prediction using discrete diffusion models.

Zhen Wang, Yizhen Feng, Qingwen Tian, Ziqi Liu, Pengju Yan, Xiaolin Li
Author Information
  1. Zhen Wang: Hangzhou Institute of Medicine, Chinese Academy of Sciences, Hangzhou 310018, Zhejiang, China.
  2. Yizhen Feng: Hangzhou Institute of Medicine, Chinese Academy of Sciences, Hangzhou 310018, Zhejiang, China.
  3. Qingwen Tian: Hangzhou Institute of Medicine, Chinese Academy of Sciences, Hangzhou 310018, Zhejiang, China.
  4. Ziqi Liu: Hangzhou Institute of Medicine, Chinese Academy of Sciences, Hangzhou 310018, Zhejiang, China.
  5. Pengju Yan: Hangzhou Institute of Medicine, Chinese Academy of Sciences, Hangzhou 310018, Zhejiang, China.
  6. Xiaolin Li: Hangzhou Institute of Medicine, Chinese Academy of Sciences, Hangzhou 310018, Zhejiang, China.

Abstract

Ribonucleic acid (RNA) molecules are essential macromolecules that perform diverse biological functions in living organisms. Accurate prediction of RNA secondary structure is instrumental in deciphering their complex three-dimensional architecture and function. Traditional methods for RNA structure prediction, both energy-based and learning-based, often treat RNA secondary structure as static and rely on stringent a priori constraints. Inspired by the success of diffusion models, we introduce RNADiffFold, a generative approach to RNA secondary structure prediction based on multinomial diffusion. We recast contact map prediction as a task akin to pixel-wise segmentation and accordingly train a denoising model to progressively refine the contact map starting from a noise-infused state. We also devise a conditioning mechanism that uses features extracted from the RNA sequence to steer the model toward generating an accurate secondary structure. These features comprise one-hot encoded sequences, probabilistic maps generated by a pre-trained scoring network, and embeddings and attention maps derived from an RNA foundation model. Experimental results on both within-family and cross-family datasets demonstrate that RNADiffFold performs competitively with current state-of-the-art methods. Additionally, RNADiffFold shows a notable ability to capture the dynamic aspects of RNA structures, as corroborated by its performance on datasets comprising multiple conformations.
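
As a rough illustration of the multinomial diffusion idea described in the abstract, the Python sketch below one-hot encodes a toy RNA sequence and applies forward noising to a binary contact map. It is not the RNADiffFold implementation. The function names, the toy sequence and base pairs, the constant noise schedule, and the choice to sample only the upper triangle and mirror it are all assumptions made for illustration. The noising rule is the standard multinomial diffusion marginal, q(x_t | x_0) = Cat(abar_t * x_0 + (1 - abar_t) / K), with K = 2 classes (paired, unpaired).

```python
import random

BASES = "ACGU"  # assumed nucleotide alphabet for the one-hot features


def one_hot(seq):
    """One-hot encode an RNA sequence as a list of length-4 rows."""
    return [[1 if base == b else 0 for b in BASES] for base in seq]


def cumulative_alpha(betas):
    """Return abar_t = prod_{s<=t} (1 - beta_s) for every step t."""
    out, prod = [], 1.0
    for beta in betas:
        prod *= 1.0 - beta
        out.append(prod)
    return out


def noise_contact_map(contact, abar_t, rng, num_classes=2):
    """Sample x_t ~ Cat(abar_t * onehot(x_0) + (1 - abar_t) / K) per cell.

    Only the upper triangle is sampled and then mirrored, so the noisy
    map stays symmetric; the diagonal is left at 0 (an assumption).
    """
    n = len(contact)
    noisy = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Keep the clean label with probability abar_t,
            # otherwise draw a label uniformly from the K classes.
            if rng.random() < abar_t:
                value = contact[i][j]
            else:
                value = rng.randrange(num_classes)
            noisy[i][j] = noisy[j][i] = value
    return noisy


# Toy example (hypothetical data): a 9-nt hairpin with three base pairs.
seq = "GGGAAACCC"
pairs = [(0, 8), (1, 7), (2, 6)]
contact = [[0] * len(seq) for _ in seq]
for i, j in pairs:
    contact[i][j] = contact[j][i] = 1

features = one_hot(seq)                # sequence conditioning features
abar = cumulative_alpha([0.02] * 50)   # assumed constant schedule, 50 steps
rng = random.Random(0)
x_t = noise_contact_map(contact, abar[-1], rng)  # heavily noised map
```

A denoising network would be trained to recover the clean contact map from `x_t` given conditioning features such as `features`; that network, and the additional conditioning inputs named in the abstract, are outside the scope of this sketch.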

Grants

  1. 2022YFC3600902/National Key Research and Development Program of China

MeSH Term

Nucleic Acid Conformation
RNA
Computational Biology
Algorithms
Software
Models, Molecular
Diffusion

Chemicals

RNA
