GSA: Genome Sequence Archive.

Yanqing Wang, Fuhai Song, Junwei Zhu, Sisi Zhang, Yadong Yang, Tingting Chen, Bixia Tang, Lili Dong, Nan Ding, Qian Zhang, Zhouxian Bai, Xunong Dong, Huanxin Chen, Mingyuan Sun, Shuang Zhai, Yubin Sun, Lei Yu, Li Lan, Jingfa Xiao, Xiangdong Fang, Hongxing Lei, Zhang Zhang, Wenming Zhao
Author Information
  1. Yanqing Wang: BIG Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China.
  2. Fuhai Song: CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100049, China.
  3. Junwei Zhu: BIG Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China.
  4. Sisi Zhang: BIG Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China.
  5. Yadong Yang: CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100049, China.
  6. Tingting Chen: BIG Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China.
  7. Bixia Tang: BIG Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100049, China.
  8. Lili Dong: BIG Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China.
  9. Nan Ding: CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China.
  10. Qian Zhang: CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China.
  11. Zhouxian Bai: CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100049, China.
  12. Xunong Dong: CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100049, China.
  13. Huanxin Chen: BIG Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China.
  14. Mingyuan Sun: BIG Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China.
  15. Shuang Zhai: BIG Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China.
  16. Yubin Sun: BIG Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China.
  17. Lei Yu: BIG Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China.
  18. Li Lan: BIG Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China.
  19. Jingfa Xiao: BIG Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100049, China; Collaborative Innovation Center of Genetics and Development, Fudan University, Shanghai 200438, China.
  20. Xiangdong Fang: CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100049, China; Collaborative Innovation Center of Genetics and Development, Fudan University, Shanghai 200438, China. Electronic address: fangxd@big.ac.cn.
  21. Hongxing Lei: CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100049, China; Center of Alzheimer's Disease, Beijing Institute for Brain Disorders, Beijing 100053, China. Electronic address: leihx@big.ac.cn.
  22. Zhang Zhang: BIG Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100049, China; Collaborative Innovation Center of Genetics and Development, Fudan University, Shanghai 200438, China. Electronic address: zhangzhang@big.ac.cn.
  23. Wenming Zhao: BIG Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100049, China; Collaborative Innovation Center of Genetics and Development, Fudan University, Shanghai 200438, China. Electronic address: zhaowm@big.ac.cn.

Abstract

With sequencing technologies advancing rapidly towards higher throughput and lower cost, sequence data are being generated at an unprecedented rate. To provide an efficient and easy-to-use platform for managing these large volumes of sequence data, we present the Genome Sequence Archive (GSA; http://bigd.big.ac.cn/gsa or http://gsa.big.ac.cn), a data repository for archiving raw sequence data. In compliance with the data standards and structures of the International Nucleotide Sequence Database Collaboration (INSDC), GSA organizes data into four data objects (BioProject, BioSample, Experiment, and Run), accepts raw sequence reads produced by a variety of sequencing platforms, stores both sequence reads and metadata submitted from all over the world, and makes all these data publicly available to the worldwide scientific community. In the era of big data, GSA not only complements existing INSDC members by easing the growing burden of handling the sequence data deluge, but also takes on significant responsibility for global big data archiving and provides free, unrestricted access to all publicly available data in support of research activities throughout the world.
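
The four data objects named in the abstract can be pictured as a small hierarchy. The Python sketch below is illustrative only and is not GSA's actual schema: the field names, the accession strings, and the `run_files` helper are all assumptions made for this example. It follows the usual INSDC-style arrangement in which an Experiment refers to one BioProject and one BioSample, and a Run holds the read files produced for an Experiment.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Run:
    """One sequencing run: the raw read files for an experiment."""
    accession: str                      # placeholder, not a real accession format
    files: List[str] = field(default_factory=list)


@dataclass
class Experiment:
    """Library and platform details; ties a sample to a project."""
    accession: str
    bioproject: str                     # accession of the owning BioProject
    biosample: str                      # accession of the sequenced BioSample
    platform: str
    runs: List[Run] = field(default_factory=list)


@dataclass
class BioSample:
    """Description of the biological material that was sequenced."""
    accession: str
    organism: str


@dataclass
class BioProject:
    """Top-level description of a research effort."""
    accession: str
    title: str
    samples: List[BioSample] = field(default_factory=list)
    experiments: List[Experiment] = field(default_factory=list)

    def run_files(self) -> List[str]:
        """Collect every read file under this project (hypothetical helper)."""
        return [f for e in self.experiments for r in e.runs for f in r.files]


# Build a toy record using made-up identifiers.
sample = BioSample(accession="SAMPLE-0001", organism="Oryza sativa")
run = Run(accession="RUN-0001", files=["reads_1.fastq.gz", "reads_2.fastq.gz"])
experiment = Experiment(
    accession="EXP-0001",
    bioproject="PROJECT-0001",
    biosample=sample.accession,
    platform="Illumina",
    runs=[run],
)
project = BioProject(
    accession="PROJECT-0001",
    title="Example resequencing project",
    samples=[sample],
    experiments=[experiment],
)

print(project.run_files())
```

Keeping the references between objects as accession strings, rather than nested objects, mirrors how submissions to an archive of this kind are typically linked, but that choice is a simplification made here for illustration.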

MeSH Terms

Animals
Databases, Genetic
High-Throughput Nucleotide Sequencing
Humans
Information Storage and Retrieval
Plants
Sequence Analysis, DNA
User-Computer Interface

Links to CNCB-NGDC Resources

Database Commons: DBC001790 (GSA)
