On the Long-term Archiving of Research Data.

Cyril Pernet, Claus Svarer, Ross Blair, John D Van Horn, Russell A Poldrack
Author Information
  1. Cyril Pernet: Neurobiology Research Unit, Rigshospitalet, Copenhagen, Denmark. wamcyril@gmail.com.
  2. Claus Svarer: Neurobiology Research Unit, Rigshospitalet, Copenhagen, Denmark.
  3. Ross Blair: Department of Psychology & Stanford Center for Reproducible Neuroscience, Stanford University, Stanford, CA, USA.
  4. John D Van Horn: Department of Psychology & School of Data Science, University of Virginia, Charlottesville, VA, USA.
  5. Russell A Poldrack: Department of Psychology & Stanford Center for Reproducible Neuroscience, Stanford University, Stanford, CA, USA.

Abstract

Accessing research data at any time is what FAIR (Findable, Accessible, Interoperable, Reusable) data sharing aims to achieve at scale. Yet we argue that it is not sustainable to keep accumulating and maintaining all datasets for rapid access, given the monetary and ecological costs of maintaining repositories. Here, we address the issue of cold data storage: when to move data to offline storage, how this can be done while maintaining FAIR principles, and who should be responsible for cold archiving and long-term preservation.

MeSH Terms

Information Dissemination
Information Storage and Retrieval