hdWGCNA identifies co-expression networks in high-dimensional transcriptomics data.

Samuel Morabito, Fairlie Reese, Negin Rahimzadeh, Emily Miyoshi, Vivek Swarup
Author Information
  1. Samuel Morabito: Mathematical, Computational, and Systems Biology (MCSB) Program, University of California, Irvine, Irvine, CA, USA.
  2. Fairlie Reese: Center for Complex Biological Systems (CCBS), University of California, Irvine, Irvine, CA, USA.
  3. Negin Rahimzadeh: Mathematical, Computational, and Systems Biology (MCSB) Program, University of California, Irvine, Irvine, CA, USA.
  4. Emily Miyoshi: Institute for Memory Impairments and Neurological Disorders (MIND), University of California, Irvine, Irvine, CA, USA.
  5. Vivek Swarup: Center for Complex Biological Systems (CCBS), University of California, Irvine, Irvine, CA, USA.

Abstract

Biological systems are immensely complex, organized into a multi-scale hierarchy of functional units based on tightly regulated interactions between distinct molecules, cells, organs, and organisms. While experimental methods enable transcriptome-wide measurements across millions of cells, popular bioinformatic tools do not support systems-level analysis. Here we present hdWGCNA, a comprehensive framework for analyzing co-expression networks in high-dimensional transcriptomics data such as single-cell and spatial RNA sequencing (RNA-seq). hdWGCNA provides functions for network inference, gene module identification, gene enrichment analysis, statistical tests, and data visualization. Beyond conventional single-cell RNA-seq, hdWGCNA is capable of performing isoform-level network analysis using long-read single-cell data. We showcase hdWGCNA using data from autism spectrum disorder and Alzheimer's disease brain samples, identifying disease-relevant co-expression network modules. hdWGCNA is directly compatible with Seurat, a widely used R package for single-cell and spatial transcriptomics analysis, and we demonstrate the scalability of hdWGCNA by analyzing a dataset containing nearly 1 million cells.
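
As an illustration of what "network inference" and "gene module identification" mean in a weighted co-expression setting, the following is a minimal sketch in plain Python. It is not hdWGCNA's API (hdWGCNA is an R package) and not the authors' implementation. The gene names, the toy expression values, the soft-threshold power of 6, and the edge cutoff of 0.5 are all assumptions chosen for illustration. Grouping genes by connected components is a deliberately simple stand-in for the hierarchical clustering of topological overlap that WGCNA-style methods typically use.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length expression vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a constant gene carries no co-expression signal
    return cov / math.sqrt(vx * vy)

def soft_threshold_adjacency(expr, power=6):
    """Weighted adjacency: |correlation| raised to a soft-threshold power."""
    genes = list(expr)
    adj = {g: {} for g in genes}
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            w = abs(pearson(expr[g], expr[h])) ** power
            adj[g][h] = w
            adj[h][g] = w
    return adj

def modules_by_components(adj, min_weight=0.5):
    """Simplified module detection: connected components over strong edges.

    This replaces the clustering step used in practice with a plain
    graph traversal, purely to keep the sketch short.
    """
    seen, modules = set(), []
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], []
        while stack:
            g = stack.pop()
            component.append(g)
            for h, w in adj[g].items():
                if w >= min_weight and h not in seen:
                    seen.add(h)
                    stack.append(h)
        modules.append(sorted(component))
    return sorted(modules)

# Hypothetical toy data: four genes measured across five cells (or metacells).
expression = {
    "geneA": [1, 2, 3, 4, 5],
    "geneB": [2, 4, 6, 8, 10],
    "geneC": [5, 1, 4, 2, 3],
    "geneD": [10, 2, 8, 4, 6],
}

adjacency = soft_threshold_adjacency(expression, power=6)
modules = modules_by_components(adjacency, min_weight=0.5)
print(modules)
```

Raising the absolute correlation to a power suppresses weak correlations while keeping strong ones close to 1, which is why the two tightly co-varying gene pairs in the toy data end up in separate modules.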

Grants

  1. U01 DA053826/NIDA NIH HHS
  2. RF1 AG071683/NIA NIH HHS
  3. P01 NS084974/NINDS NIH HHS
  4. U19 AG068054/NIA NIH HHS
  5. U54 AG054349/NIA NIH HHS
  6. R01 AG071683/NIA NIH HHS

MeSH Terms

Humans
Transcriptome
Autism Spectrum Disorder
Gene Expression Profiling
Gene Regulatory Networks
Alzheimer Disease
