SurfR: Riding the wave of RNA-seq data with a comprehensive Bioconductor package to identify surface protein-coding genes.

Aurora Maurizio, Anna Sofia Tascini, Marco J Morelli
Author Information
  1. Aurora Maurizio: Center for Omics Sciences, IRCCS San Raffaele Scientific Institute, Milan 20132, Italy.
  2. Anna Sofia Tascini: Center for Omics Sciences, IRCCS San Raffaele Scientific Institute, Milan 20132, Italy.
  3. Marco J Morelli: Center for Omics Sciences, IRCCS San Raffaele Scientific Institute, Milan 20132, Italy.

Abstract

Motivation: Proteins at the cell surface connect signaling networks and largely determine a cell's capacity to communicate and interact with its environment. Transcriptomic profiles often differ between healthy and diseased cells, leading to distinct sets of cell-surface proteins. For these reasons, cell-surface proteins may act as biomarkers for detecting cells of interest in tissues or body fluids, are often the target of pharmaceutical agents, and hold significant promise in clinical practice for diagnosis, prognosis, treatment development, and evaluation of therapy response. Robust methods to identify condition-specific cell-surface proteins are therefore of pivotal importance for advancing biomedical research.
Results: We developed SurfR, an R/Bioconductor package providing a streamlined end-to-end workflow for computationally identifying surface protein-coding genes from expression data. Our user-friendly, comprehensive workflow performs systematic retrieval of expression data from public databases, differential gene expression analysis across conditions, integration of datasets, enrichment analysis, identification of targetable proteins in a condition of interest, and data visualization.
Availability and implementation: SurfR is released under the GNU GPL v3.0 license. Source code, documentation, examples, and tutorials are available through Bioconductor (http://www.bioconductor.org/packages/SurfR). R Markdown notebooks containing the code for the use cases described in the manuscript can be found on GitHub (https://github.com/auroramaurizio/SurfR_UseCases).
