tidybulk: an R tidy framework for modular transcriptomic data analysis.

Stefano Mangiola, Ramyar Molania, Ruining Dong, Maria A Doyle, Anthony T Papenfuss
Author Information
  1. Stefano Mangiola: Bioinformatics Division, Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia.
  2. Ramyar Molania: Bioinformatics Division, Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia.
  3. Ruining Dong: Bioinformatics Division, Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia.
  4. Maria A Doyle: Peter MacCallum Cancer Centre, Melbourne, VIC, 3000, Australia.
  5. Anthony T Papenfuss: Bioinformatics Division, Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia. papenfuss@wehi.edu.au.

Abstract

Recent efforts have sought to harmonize transcriptomic data structures and workflows around the concept of data tidiness, which facilitates modularisation. We present tidybulk, a modular framework for bulk transcriptional analyses that introduces a tidy transcriptomic data structure paradigm and analysis grammar. Tidybulk covers a wide variety of analysis procedures and integrates a large ecosystem of publicly available analysis algorithms under a common framework. Tidybulk decreases coding burden, facilitates reproducibility, increases efficiency for expert users, lowers the learning curve for inexperienced users, and bridges transcriptional data analysis with the tidyverse. Tidybulk is available from R/Bioconductor at bioconductor.org/packages/tidybulk.
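To make the idea of a tidy transcriptomic data structure concrete, the following is a minimal sketch in base R only. It does not use or reproduce the tidybulk API; the gene names, sample names, counts, and the `condition` annotation are all invented for illustration. It assumes the usual tidy convention of one row per sample-transcript pair, and shows how a per-sample step (here, counts per million) becomes a column added to the same table.

```r
# Hypothetical toy counts: 3 transcripts x 2 samples (values are invented).
counts <- matrix(
  c(10L, 0L, 25L,   # sample s1
    12L, 3L, 30L),  # sample s2
  nrow = 3,
  dimnames = list(transcript = c("geneA", "geneB", "geneC"),
                  sample     = c("s1", "s2"))
)

# Wide matrix -> tidy long table: one row per sample-transcript pair.
tidy_counts <- as.data.frame(as.table(counts),
                             responseName = "count",
                             stringsAsFactors = FALSE)

# Sample-level annotation lives in the same table (hypothetical conditions).
annotation <- data.frame(sample = c("s1", "s2"),
                         condition = c("control", "treated"),
                         stringsAsFactors = FALSE)
tidy_counts <- merge(tidy_counts, annotation, by = "sample")

# A modular analysis step adds columns rather than creating a new object:
# library size per sample, then counts per million.
tidy_counts$library_size <- ave(tidy_counts$count, tidy_counts$sample, FUN = sum)
tidy_counts$cpm <- tidy_counts$count / tidy_counts$library_size * 1e6

print(tidy_counts)
```

Because abundance, annotation, and derived quantities share one table, each later step can take the table as input and return it with extra columns. This is the kind of modularity the abstract describes, shown here only schematically.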

MeSH Term

Algorithms
Computational Biology
Data Analysis
Ecosystem
Gene Expression Profiling
Genomics
Reproducibility of Results
Software
Transcriptome
