Unveiling the molecular complexity of proliferative diabetic retinopathy through scRNA-seq, AlphaFold 2, and machine learning.

Jun Wang, Hongyan Sun, Lisha Mou, Ying Lu, Zijing Wu, Zuhui Pu, Ming-Ming Yang
Author Information
  1. Jun Wang: Department of Endocrinology, Shenzhen People's Hospital (The Second Clinical Medical College of Jinan University; The First Affiliated Hospital, Southern University of Science and Technology), Shenzhen, China.
  2. Hongyan Sun: Department of Ophthalmology, Shenzhen People's Hospital (The Second Clinical Medical College, Jinan University; The First Affiliated Hospital, Southern University of Science and Technology), Shenzhen, China.
  3. Lisha Mou: Imaging Department, Shenzhen Institute of Translational Medicine, The First Affiliated Hospital of Shenzhen University, Shenzhen Second People's Hospital, Shenzhen, China.
  4. Ying Lu: Imaging Department, Shenzhen Institute of Translational Medicine, The First Affiliated Hospital of Shenzhen University, Shenzhen Second People's Hospital, Shenzhen, China.
  5. Zijing Wu: Imaging Department, Shenzhen Institute of Translational Medicine, The First Affiliated Hospital of Shenzhen University, Shenzhen Second People's Hospital, Shenzhen, China.
  6. Zuhui Pu: Imaging Department, Shenzhen Institute of Translational Medicine, The First Affiliated Hospital of Shenzhen University, Shenzhen Second People's Hospital, Shenzhen, China.
  7. Ming-Ming Yang: Department of Ophthalmology, Shenzhen People's Hospital (The Second Clinical Medical College, Jinan University; The First Affiliated Hospital, Southern University of Science and Technology), Shenzhen, China.

Abstract

Background: Proliferative diabetic retinopathy (PDR), a major cause of blindness, has a complex pathogenesis. This study integrates single-cell RNA sequencing (scRNA-seq), non-negative matrix factorization (NMF), machine learning, and AlphaFold 2 to explore the molecular basis of PDR.
Methods: We analyzed scRNA-seq data from PDR patients and healthy controls to identify distinct cellular subtypes and gene expression patterns. NMF was used to define specific transcriptional programs in PDR. The oxidative stress-related genes (ORGs) identified within Meta-Program 1 were used to construct a predictive model using twelve machine learning algorithms. We also used AlphaFold 2 to predict protein structures and complemented this with molecular docking to validate the structural basis of potential therapeutic targets. Finally, we analyzed protein-protein interaction (PPI) networks and the interplay among key ORGs.
Results: Our scRNA-seq analysis revealed five major cell types and 14 cell subtypes in PDR patients, with significant differences in gene expression compared to those in controls. We identified three key meta-programs underscoring the role of microglia in the pathogenesis of PDR. Three critical ORGs (ALKBH1, PSIP1, and ATP13A2) were identified, with the best-performing predictive model demonstrating high accuracy (AUC of 0.989 in the training cohort and 0.833 in the validation cohort). Moreover, AlphaFold 2 predictions combined with molecular docking revealed that resveratrol has a strong affinity for ALKBH1, indicating its potential as a targeted therapeutic agent. PPI network analysis revealed a complex network of interactions among the hub ORGs and other genes, suggesting a collective role in PDR pathogenesis.
Conclusion: This study provides insights into the cellular and molecular aspects of PDR, identifying potential biomarkers and therapeutic targets using advanced technological approaches.
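
The AUC values reported in the Results summarize how well a model's scores rank cases above controls. As a minimal sketch of that metric only, the Python below computes AUC from the pairwise (Mann-Whitney) definition. The function name roc_auc and all labels and scores are hypothetical illustrations, not the study's data or the authors' code.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the pairwise (Mann-Whitney) definition.

    labels: sequence of 0/1 values (1 = case, 0 = control)
    scores: sequence of model scores, higher = more likely a case
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one control")
    # Count case/control pairs that are ranked correctly; ties count as half.
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical scores for illustration only (not data from the study).
train_labels = [1, 1, 1, 1, 0, 0, 0, 0]
train_scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.30, 0.20, 0.10]
valid_labels = [1, 1, 1, 0, 0, 0]
valid_scores = [0.90, 0.60, 0.40, 0.70, 0.30, 0.20]

train_auc = roc_auc(train_labels, train_scores)
valid_auc = roc_auc(valid_labels, valid_scores)
print(f"training AUC = {train_auc:.3f}, validation AUC = {valid_auc:.3f}")
```

In this toy data every case outscores every control in the training set, while two of the nine case/control pairs are misordered in the validation set, so the validation AUC is lower. A gap of this kind between training and validation AUC is commonly read as a sign of some overfitting.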

MeSH Terms

Humans
Machine Learning
Diabetic Retinopathy
Molecular Docking Simulation
Single-Cell Analysis
Sequence Analysis, RNA
RNA-Seq
Protein Interaction Maps
Female
Male
Oxidative Stress
Case-Control Studies
Single-Cell Gene Expression Analysis
