Gene Expression Nebulas (GEN): a comprehensive data portal integrating transcriptomic profiles across multiple species at both bulk and single-cell levels.

Yuansheng Zhang, Dong Zou, Tongtong Zhu, Tianyi Xu, Ming Chen, Guangyi Niu, Wenting Zong, Rong Pan, Wei Jing, Jian Sang, Chang Liu, Yujia Xiong, Yubin Sun, Shuang Zhai, Huanxin Chen, Wenming Zhao, Jingfa Xiao, Yiming Bao, Lili Hao, Zhang Zhang
Author Information
  1. Yuansheng Zhang: National Genomics Data Center & CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China.
  2. Dong Zou: National Genomics Data Center & CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China.
  3. Tongtong Zhu: National Genomics Data Center & CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China.
  4. Tianyi Xu: National Genomics Data Center & CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China.
  5. Ming Chen: National Genomics Data Center & CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China.
  6. Guangyi Niu: National Genomics Data Center & CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China.
  7. Wenting Zong: National Genomics Data Center & CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China.
  8. Rong Pan: National Genomics Data Center & CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China.
  9. Wei Jing: National Genomics Data Center & CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China.
  10. Jian Sang: National Genomics Data Center & CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China.
  11. Chang Liu: National Genomics Data Center & CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China.
  12. Yujia Xiong: Beijing Neurosurgical Institute, Capital Medical University, Beijing 100069, China.
  13. Yubin Sun: National Genomics Data Center & CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China.
  14. Shuang Zhai: National Genomics Data Center & CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China.
  15. Huanxin Chen: National Genomics Data Center & CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China.
  16. Wenming Zhao: National Genomics Data Center & CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China.
  17. Jingfa Xiao: National Genomics Data Center & CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China.
  18. Yiming Bao: National Genomics Data Center & CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China.
  19. Lili Hao: National Genomics Data Center & CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China.
  20. Zhang Zhang: National Genomics Data Center & CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China.

Abstract

Transcriptomic profiling is critical to uncovering functional elements from transcriptional and post-transcriptional aspects. Here, we present Gene Expression Nebulas (GEN, https://ngdc.cncb.ac.cn/gen/), an open-access data portal integrating transcriptomic profiles under various biological contexts. GEN features a curated collection of high-quality bulk and single-cell RNA sequencing datasets, built with standardized data processing pipelines and a structured curation model. Currently, GEN houses gene expression profiles from 323 datasets (157 bulk and 166 single-cell), covering 50,500 samples and 15,540,169 cells across 30 species, which are further categorized into six biological contexts. Moreover, GEN integrates a full range of transcriptomic profiles on expression, RNA editing and alternative splicing for 10 bulk datasets, providing opportunities for users to conduct integrative analysis at both transcriptional and post-transcriptional levels. In addition, GEN provides abundant gene annotations based on value-added curation of transcriptomic profiles and delivers online services for data analysis and visualization. Collectively, GEN presents a comprehensive collection of transcriptomic profiles across multiple species, thus serving as a fundamental resource for better understanding genetic regulatory architecture and functional mechanisms from tissues to cells.

MeSH Term

Animals
Databases, Genetic
Gene Expression Profiling
Gene Expression Regulation
Humans
Molecular Sequence Annotation
Single-Cell Analysis
Transcriptome

Links to CNCB-NGDC Resources

Database Commons: DBC006013 (GEN)
