CAEM-GBDT: a cancer subtype identifying method using multi-omics data and convolutional autoencoder network.

Jiquan Shen, Xuanhui Guo, Hanwen Bai, Junwei Luo
Author Information
  1. Jiquan Shen: School of Software, Henan Polytechnic University, Jiaozuo, China.
  2. Xuanhui Guo: School of Software, Henan Polytechnic University, Jiaozuo, China.
  3. Hanwen Bai: School of Software, Henan Polytechnic University, Jiaozuo, China.
  4. Junwei Luo: School of Software, Henan Polytechnic University, Jiaozuo, China.

Abstract

Identifying cancer subtypes is important in medicine: accurate subtype identification supports both cancer treatment and prognosis. Currently, most methods for cancer subtype identification rely on single-omics data, such as gene expression data. However, multi-omics data capture complementary characteristics of a cancer, which can improve the accuracy of subtype identification. The main challenge researchers face is therefore how to extract features from multi-omics data for cancer subtype identification. In this paper, we propose a cancer subtype identification method named CAEM-GBDT, which takes gene expression data, miRNA expression data, and DNA methylation data as input and uses a convolutional autoencoder network to identify cancer subtypes. The method extracts features from the input data through a convolutional encoder layer. A convolutional self-attention module is embedded within this encoder layer to learn higher-level representations of the multi-omics data. The high-level representations extracted by the convolutional encoder are then concatenated with the input to the decoder. Finally, a GBDT (Gradient Boosting Decision Tree) is used to identify the cancer subtypes. In the experiments, we compare CAEM-GBDT with existing cancer subtype identification methods. The experimental results demonstrate that the proposed CAEM-GBDT outperforms the other methods. The source code is available on GitHub at https://github.com/gxh-1/CAEM-GBDT.git.
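The overall data flow described in the abstract (combine the three omics layers, extract a compact representation, classify with a GBDT) can be illustrated with a minimal sketch. This is not the authors' implementation: the data are synthetic, all variable names are hypothetical, and PCA is swapped in as a simple stand-in for the convolutional autoencoder with self-attention, whose architecture the abstract does not specify in detail. Only the final stage uses a gradient boosting classifier as the abstract describes, here via scikit-learn.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-ins for real data (assumption: rows are samples).
n_samples, n_subtypes = 120, 3
labels = rng.integers(0, n_subtypes, size=n_samples)


def fake_omics(n_features):
    """Random matrix whose mean shifts with the subtype label (illustration only)."""
    return rng.normal(size=(n_samples, n_features)) + labels[:, None] * 0.5


gene_expr = fake_omics(200)    # hypothetical gene expression matrix
mirna_expr = fake_omics(50)    # hypothetical miRNA expression matrix
methylation = fake_omics(100)  # hypothetical DNA methylation matrix

# Step 1: combine the three omics layers into one feature matrix per sample.
multi_omics = np.concatenate([gene_expr, mirna_expr, methylation], axis=1)

X_train, X_test, y_train, y_test = train_test_split(
    multi_omics, labels, test_size=0.25, random_state=0, stratify=labels
)

# Step 2: feature extraction. PCA is a stand-in here, NOT the paper's
# convolutional autoencoder; it only plays the role of "encoder".
encoder = PCA(n_components=16, random_state=0).fit(X_train)
z_train = encoder.transform(X_train)
z_test = encoder.transform(X_test)

# Step 3: classify subtypes from the extracted features with a GBDT.
gbdt = GradientBoostingClassifier(n_estimators=50, random_state=0)
gbdt.fit(z_train, y_train)
predicted = gbdt.predict(z_test)
accuracy = gbdt.score(z_test, y_test)
print(f"held-out accuracy on synthetic data: {accuracy:.2f}")
```

The encoder is fitted on the training split only, so the held-out samples do not influence the learned representation. The accuracy printed here reflects the synthetic data and says nothing about the results reported for CAEM-GBDT.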
