CNSA: a data repository for archiving omics data.

Xueqin Guo, Fengzhen Chen, Fei Gao, Ling Li, Ke Liu, Lijin You, Cong Hua, Fan Yang, Wanliang Liu, Chunhua Peng, Lina Wang, Xiaoxia Yang, Feiyu Zhou, Jiawei Tong, Jia Cai, Zhiyong Li, Bo Wan, Lei Zhang, Tao Yang, Minwen Zhang, Linlin Yang, Yawen Yang, Wenjun Zeng, Bo Wang, Xiaofeng Wei, Xun Xu
Author Information
  1. Xueqin Guo: China National GeneBank, Shenzhen 518120, China.
  2. Fengzhen Chen: China National GeneBank, Shenzhen 518120, China.
  3. Fei Gao: China National GeneBank, Shenzhen 518120, China.
  4. Ling Li: China National GeneBank, Shenzhen 518120, China.
  5. Ke Liu: China National GeneBank, Shenzhen 518120, China.
  6. Lijin You: China National GeneBank, Shenzhen 518120, China.
  7. Cong Hua: China National GeneBank, Shenzhen 518120, China.
  8. Fan Yang: China National GeneBank, Shenzhen 518120, China.
  9. Wanliang Liu: China National GeneBank, Shenzhen 518120, China.
  10. Chunhua Peng: China National GeneBank, Shenzhen 518120, China.
  11. Lina Wang: China National GeneBank, Shenzhen 518120, China.
  12. Xiaoxia Yang: China National GeneBank, Shenzhen 518120, China.
  13. Feiyu Zhou: China National GeneBank, Shenzhen 518120, China.
  14. Jiawei Tong: China National GeneBank, Shenzhen 518120, China.
  15. Jia Cai: China National GeneBank, Shenzhen 518120, China.
  16. Zhiyong Li: China National GeneBank, Shenzhen 518120, China.
  17. Bo Wan: China National GeneBank, Shenzhen 518120, China.
  18. Lei Zhang: China National GeneBank, Shenzhen 518120, China.
  19. Tao Yang: China National GeneBank, Shenzhen 518120, China.
  20. Minwen Zhang: China National GeneBank, Shenzhen 518120, China.
  21. Linlin Yang: China National GeneBank, Shenzhen 518120, China.
  22. Yawen Yang: China National GeneBank, Shenzhen 518120, China.
  23. Wenjun Zeng: China National GeneBank, Shenzhen 518120, China.
  24. Bo Wang: China National GeneBank, Shenzhen 518120, China.
  25. Xiaofeng Wei: China National GeneBank, Shenzhen 518120, China.
  26. Xun Xu: China National GeneBank, Shenzhen 518120, China.

Abstract

With the development and application of high-throughput sequencing technology in the life and health sciences, massive multi-omics datasets pose the problem of efficient management and utilization. Database development and biocuration are prerequisites for the reuse of these big data. Here, relying on the China National GeneBank (CNGB), we present the CNGB Sequence Archive (CNSA) for archiving omics data, including raw sequencing data and further analyzed results, which are currently organized into six objects: Project, Sample, Experiment, Run, Assembly and Variation. Moreover, for some projects CNSA has created a correlation model linking living samples, sample information and analytical data. Both living samples and analytical data are directly correlated with the sample information. Starting from any one of the three, the information or data of the other two can be obtained, so that all data can be traced throughout the life cycle from the living sample to the sample information to the analytical data. Complying with data standards commonly used in the life sciences, CNSA is committed to building a comprehensive, curated data repository for storing, managing and sharing omics data. We will continue to improve the data standards and provide free access to open-data resources for scientific communities worldwide to support academic research and the bio-industry. Database URL: https://db.cngb.org/cnsa/.
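
The correlation model described in the abstract can be illustrated with a minimal sketch. This is not CNSA's actual schema or API: the class names (SampleInfo, TraceIndex), the method names and the identifiers such as "SAMPLE-1" are all hypothetical, chosen only to show how placing sample information at the hub lets a lookup start from a living sample or from analytical data and reach the other two.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SampleInfo:
    """Sample information record: the hub both other record types link to."""
    sample_id: str
    living_sample_ids: List[str] = field(default_factory=list)
    analytical_data_ids: List[str] = field(default_factory=list)


class TraceIndex:
    """Toy in-memory index (hypothetical, not CNSA's implementation)."""

    def __init__(self) -> None:
        self.samples: Dict[str, SampleInfo] = {}
        # Reverse lookups: living sample / analytical data -> sample ID.
        self._sample_of_living: Dict[str, str] = {}
        self._sample_of_data: Dict[str, str] = {}

    def add_sample(self, sample_id: str) -> SampleInfo:
        # Create the sample record on first use, otherwise return the existing one.
        if sample_id not in self.samples:
            self.samples[sample_id] = SampleInfo(sample_id)
        return self.samples[sample_id]

    def link_living_sample(self, sample_id: str, living_id: str) -> None:
        self.add_sample(sample_id).living_sample_ids.append(living_id)
        self._sample_of_living[living_id] = sample_id

    def link_analytical_data(self, sample_id: str, data_id: str) -> None:
        self.add_sample(sample_id).analytical_data_ids.append(data_id)
        self._sample_of_data[data_id] = sample_id

    def trace_from_living_sample(self, living_id: str) -> SampleInfo:
        # Living sample -> sample information (which lists the analytical data).
        return self.samples[self._sample_of_living[living_id]]

    def trace_from_analytical_data(self, data_id: str) -> SampleInfo:
        # Analytical data -> sample information (which lists the living samples).
        return self.samples[self._sample_of_data[data_id]]


index = TraceIndex()
index.link_living_sample("SAMPLE-1", "LIVING-1")
index.link_analytical_data("SAMPLE-1", "RUN-1")
index.link_analytical_data("SAMPLE-1", "ASSEMBLY-1")

record = index.trace_from_living_sample("LIVING-1")
print(record.sample_id, record.analytical_data_ids)
```

Because every link passes through the sample information record, each kind of entry needs only one reverse lookup to reach the other two, which mirrors the abstract's statement that both living samples and analytical data are directly correlated with the sample information.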

MeSH Term

Big Data
Computational Biology
Data Curation
Database Management Systems
Databases, Genetic

Links to CNCB-NGDC Resources

Database Commons: DBC007299 (CNSA)
