The Sequence Read Archive: a decade more of explosive growth.

Kenneth Katz, Oleg Shutov, Richard Lapoint, Michael Kimelman, J Rodney Brister, Christopher O'Sullivan

Author Information

Kenneth Katz: National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA.
Oleg Shutov: National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA.
Richard Lapoint: National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA.
Michael Kimelman: National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA.
J Rodney Brister: National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA.
Christopher O'Sullivan: National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA.

PMID: 34850094 DOI: 10.1093/nar/gkab1053

The Sequence Read Archive (SRA, https://www.ncbi.nlm.nih.gov/sra/) stores raw sequencing data and alignment information to enhance reproducibility and facilitate new discoveries through data analysis. Here we note changes in storage designed to increase access and highlight analyses that augment metadata with taxonomic insight to help users select data. In addition, we present three unanticipated applications of taxonomic analysis.

Bacteria

Base Sequence

Databases, Genetic

High-Throughput Nucleotide Sequencing

Internet

Metadata

Phylogeny

Reproducibility of Results

SARS-CoV-2

Sequence Analysis, RNA

Software

Viruses

Journal Article; Research Support, N.I.H., Intramural
