STGRNS: an interpretable transformer-based method for inferring gene regulatory networks from single-cell transcriptomic data.

Jing Xu, Aidi Zhang, Fang Liu, Xiujun Zhang
Author Information
  1. Jing Xu: Key Laboratory of Plant Germplasm Enhancement and Specialty Agriculture, Wuhan Botanical Garden, Chinese Academy of Sciences, Wuhan 430074, China.
  2. Aidi Zhang: Key Laboratory of Plant Germplasm Enhancement and Specialty Agriculture, Wuhan Botanical Garden, Chinese Academy of Sciences, Wuhan 430074, China.
  3. Fang Liu: Key Laboratory of Plant Germplasm Enhancement and Specialty Agriculture, Wuhan Botanical Garden, Chinese Academy of Sciences, Wuhan 430074, China.
  4. Xiujun Zhang: Key Laboratory of Plant Germplasm Enhancement and Specialty Agriculture, Wuhan Botanical Garden, Chinese Academy of Sciences, Wuhan 430074, China.

Abstract

MOTIVATION: Single-cell RNA-sequencing (scRNA-seq) technologies make it possible to infer cell-specific gene regulatory networks (GRNs), an important challenge in systems biology. Although numerous methods have been developed for inferring GRNs from scRNA-seq data, handling cellular heterogeneity remains difficult.
RESULTS: To address this challenge, we developed STGRNS, an interpretable transformer-based method for inferring GRNs from scRNA-seq data. In this algorithm, a gene expression motif technique is proposed to convert gene pairs into contiguous sub-vectors, which can be used as input for the transformer encoder. By avoiding missing phase-specific regulations in a network, the gene expression motif can improve the accuracy of GRN inference for different types of scRNA-seq data. To assess the performance of STGRNS, we ran comparative experiments against several popular methods on extensive benchmark datasets, including 21 static and 27 time-series scRNA-seq datasets. Across these benchmarks, STGRNS outperformed the compared methods. In addition, STGRNS was shown to be more interpretable than "black box" deep learning methods, whose predictions are well known to be difficult to explain clearly.
AVAILABILITY AND IMPLEMENTATION: The source code and data are available at https://github.com/zhanglab-wbgcas/STGRNS.
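
The idea of converting a gene pair into contiguous sub-vectors can be illustrated with a minimal sketch. This is not the STGRNS implementation: the function name `gene_pair_to_subvectors`, the `window` parameter, the choice to drop trailing cells that do not fill a window, and the layout of each token (regulator values followed by target values) are all assumptions made for illustration. It assumes the cells are already in a meaningful order (for example, along pseudotime), so that each window covers one contiguous phase.

```python
import numpy as np


def gene_pair_to_subvectors(tf_expr, target_expr, window=4):
    """Hypothetical helper: split a gene pair into contiguous sub-vectors.

    Each output row covers one contiguous block of `window` cells and holds
    the regulator's values followed by the target's values for that block.
    The exact token layout used by STGRNS may differ; this is an assumption.
    """
    tf_expr = np.asarray(tf_expr, dtype=float)
    target_expr = np.asarray(target_expr, dtype=float)
    if tf_expr.ndim != 1 or tf_expr.shape != target_expr.shape:
        raise ValueError("expected two 1-D vectors of equal length")

    n_windows = tf_expr.size // window
    if n_windows == 0:
        raise ValueError("window is longer than the expression vectors")

    # Assumption: trailing cells that do not fill a whole window are dropped.
    usable = n_windows * window
    tf_chunks = tf_expr[:usable].reshape(n_windows, window)
    target_chunks = target_expr[:usable].reshape(n_windows, window)

    # One row per window: a sequence of tokens a transformer encoder could read.
    return np.concatenate([tf_chunks, target_chunks], axis=1)


# Toy data: 10 ordered cells, one regulator and one target gene.
tokens = gene_pair_to_subvectors(np.arange(10), np.arange(10, 20), window=4)
print(tokens.shape)
```

Because each row is built from a contiguous block of cells, a regulatory signal that appears only in one phase stays concentrated in a few tokens rather than being averaged over all cells, which is the intuition the abstract gives for not missing phase-specific regulations.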

MeSH Term

Gene Regulatory Networks
Transcriptome
Gene Expression Profiling
Software
Algorithms
Single-Cell Analysis
Sequence Analysis, RNA
