MAMS: matrix and analysis metadata standards to facilitate harmonization and reproducibility of single-cell data.

Irzam Sarfraz, Yichen Wang, Amulya Shastry, Wei Kheng Teh, Artem Sokolov, Brian R Herb, Heather H Creasy, Isaac Virshup, Ruben Dries, Kylee Degatano, Anup Mahurkar, Daniel J Schnell, Pedro Madrigal, Jason Hilton, Nils Gehlenborg, Timothy Tickle, Joshua D Campbell
Author Information
  1. Irzam Sarfraz: Department of Medicine, Boston University School of Medicine, Boston, MA, USA.
  2. Yichen Wang: Department of Medicine, Boston University School of Medicine, Boston, MA, USA.
  3. Amulya Shastry: Department of Medicine, Boston University School of Medicine, Boston, MA, USA.
  4. Wei Kheng Teh: European Bioinformatics Institute, European Molecular Biology Laboratory, Hinxton, Cambridgeshire, UK.
  5. Artem Sokolov: Laboratory of Systems Pharmacology, Harvard Medical School, Boston, MA, USA.
  6. Brian R Herb: Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, USA.
  7. Heather H Creasy: Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, USA.
  8. Isaac Virshup: Department of Computational Health, Helmholtz Munich, Oberschleißheim, Germany.
  9. Ruben Dries: Department of Medicine, Boston University School of Medicine, Boston, MA, USA.
  10. Kylee Degatano: Data Sciences Platform, Broad Institute, Cambridge, MA, USA.
  11. Anup Mahurkar: Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, USA.
  12. Daniel J Schnell: Biomedical Informatics, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA.
  13. Pedro Madrigal: European Bioinformatics Institute, European Molecular Biology Laboratory, Hinxton, Cambridgeshire, UK.
  14. Jason Hilton: Department of Genetics, Stanford University, Stanford, CA, USA.
  15. Nils Gehlenborg: Biomedical Informatics, Harvard Medical School, Boston, MA, USA.
  16. Timothy Tickle: Data Sciences Platform, Broad Institute, Cambridge, MA, USA.
  17. Joshua D Campbell: Department of Medicine, Boston University School of Medicine, Boston, MA, USA. camp@bu.edu.

Abstract

Many datasets are being produced by consortia that seek to characterize healthy and diseased tissues at single-cell resolution. While biospecimen and experimental information is often captured, detailed metadata standards for data matrices and analysis workflows are currently lacking. To address this, we developed the matrix and analysis metadata standards (MAMS) to serve as a resource for data centers, repositories, and tool developers. We defined metadata fields for the matrices and parameters commonly used in analytical workflows and developed the rmams package to extract MAMS from single-cell objects. Overall, MAMS promotes the harmonization, integration, and reproducibility of single-cell data across platforms.

Grants

  1. U24HL148865/NHLBI NIH HHS
  2. U24 HL148865/NHLBI NIH HHS
  3. U2C CA233238/NCI NIH HHS
  4. /Wellcome Trust
  5. 1U2CCA233238-01/Cancer Moonshot
  6. 108437/Z/15/Z/Wellcome Trust

MeSH Terms

Metadata
Single-Cell Analysis
Reproducibility of Results
Humans
Software
