New generative methods for single-cell transcriptome data in bulk RNA sequence deconvolution.

Toui Nishikawa, Masatoshi Lee, Masataka Amau
Author Information
  1. Toui Nishikawa: Faculty of Medicine, Wakayama Medical University, 811-1 Kimiidera, Wakayama, 641-8509, Japan. toui.nskw@gmail.com.
  2. Masatoshi Lee: Faculty of Medicine, Wakayama Medical University, 811-1 Kimiidera, Wakayama, 641-8509, Japan.
  3. Masataka Amau: Faculty of Medicine, Kyoto University, Kyoto, Japan.

Abstract

Numerous methods for bulk RNA sequence deconvolution have been developed to identify cellular targets of disease by estimating the cell-type composition of disease-related tissues. However, two obstacles to accurate bulk deconvolution remain: heterogeneity in gene expression between subjects and a shortage of reference single-cell RNA sequence data. In this study, we investigated whether a new generative method, sc-CMGAN, and three benchmark generative methods (Copula, CTGAN and TVAE) could address these issues and improve bulk deconvolution. We also evaluated the robustness of sc-CMGAN using three deconvolution methods and four public datasets. In almost all conditions, the generative methods improved deconvolution. Notably, sc-CMGAN outperformed the benchmark methods and was more robust. This study is the first to examine the impact of data augmentation on bulk deconvolution. sc-CMGAN is expected to become a powerful tool for preprocessing in bulk deconvolution.
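For readers unfamiliar with the task, reference-based bulk deconvolution can be sketched as a non-negative least-squares problem: a bulk profile is modeled as a mixture of cell-type signatures averaged from single-cell reference data. The example below is only an illustration under that assumption. It is not sc-CMGAN, and it is not one of the three deconvolution methods evaluated in the study, which the abstract does not name. The toy expression values, cell-type labels, and function names are all hypothetical.

```python
import numpy as np


def signature_matrix(cells, labels):
    """Average single-cell profiles per cell type.

    cells: (n_cells, n_genes) expression array; labels: one type per cell.
    Returns the sorted type names and a (n_genes, n_types) signature matrix.
    """
    labels = np.asarray(labels)
    types = sorted(set(labels.tolist()))
    sig = np.column_stack([cells[labels == t].mean(axis=0) for t in types])
    return types, sig


def deconvolve(bulk, sig, n_iter=5000):
    """Estimate cell-type fractions by projected gradient descent.

    Minimizes 0.5 * ||sig @ w - bulk||^2 subject to w >= 0, then
    rescales w so the fractions sum to one.
    """
    # Step size 1/L, where L = (largest singular value)^2 bounds the gradient.
    step = 1.0 / np.linalg.norm(sig, 2) ** 2
    w = np.full(sig.shape[1], 1.0 / sig.shape[1])
    for _ in range(n_iter):
        grad = sig.T @ (sig @ w - bulk)
        w = np.maximum(w - step * grad, 0.0)  # project onto w >= 0
    return w / w.sum()


# Hypothetical toy reference: 6 cells x 4 genes, three cell types.
cells = np.array([
    [10, 1, 0, 2],
    [12, 0, 1, 2],
    [1, 9, 0, 3],
    [0, 11, 1, 3],
    [0, 1, 8, 1],
    [1, 0, 10, 1],
], dtype=float)
labels = ["T", "T", "B", "B", "Mono", "Mono"]

types, sig = signature_matrix(cells, labels)

# Simulate a noise-free bulk sample with known fractions (order follows `types`).
true_fractions = np.array([0.5, 0.2, 0.3])
bulk = sig @ true_fractions

estimated = deconvolve(bulk, sig)
print(dict(zip(types, np.round(estimated, 3))))
```

In this framing, a shortage of reference cells or strong between-subject heterogeneity makes the averaged signatures less representative of the bulk sample, which is the gap that augmenting the single-cell reference with generated cells is meant to narrow.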

MeSH Term

Humans
Transcriptome
Gene Expression Profiling
Base Sequence
Sequence Analysis, RNA
Single-Cell Analysis
