scFSNN: a feature selection method based on neural network for single-cell RNA-seq data.

Minjiao Peng, Baoqin Lin, Jun Zhang, Yan Zhou, Bingqing Lin
Author Information
  1. Minjiao Peng: School of Mathematical Sciences, Shenzhen University, Nanshan, Shenzhen, 518060, Guangdong, China.
  2. Baoqin Lin: Experimental Center, The First Affiliated Hospital of Guangzhou University of Chinese Medicine, Guangzhou, Guangdong, 510405, China.
  3. Jun Zhang: School of Mathematical Sciences, Shenzhen University, Nanshan, Shenzhen, 518060, Guangdong, China.
  4. Yan Zhou: School of Mathematical Sciences, Shenzhen University, Nanshan, Shenzhen, 518060, Guangdong, China.
  5. Bingqing Lin: School of Mathematical Sciences, Shenzhen University, Nanshan, Shenzhen, 518060, Guangdong, China. bqlin@szu.edu.cn.

Abstract

While single-cell RNA sequencing (scRNA-seq) allows researchers to analyze gene expression in individual cells, its characteristic properties, such as over-dispersion, zero-inflation, high gene-gene correlation, and large data volume with many features, pose challenges for most existing feature selection methods. In this paper, we present scFSNN, a neural-network-based feature selection method for classification problems on scRNA-seq data. scFSNN is an embedded method that automatically selects features (genes) during model training, controls the false discovery rate of the selected features, and adaptively determines the number of features to eliminate. Extensive simulation and real-data studies demonstrate its excellent feature selection ability and predictive performance.
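
The abstract does not describe the algorithm itself, so the following is only an illustrative sketch of one generic way to combine embedded feature selection, an estimate of the false discovery rate (FDR), and an adaptive elimination step. It is not the authors' scFSNN implementation. Everything in it is an assumption made for illustration: the function names `train_logistic` and `select_features`, the parameters `target_fdr` and `drop_frac`, the use of a single-layer network, the use of permuted "surrogate" columns as known-null features, and the synthetic data.

```python
import numpy as np


def train_logistic(X, y, lr=0.1, epochs=200):
    """Fit a single-layer network (logistic regression) by gradient descent."""
    n, p = X.shape
    w = np.zeros(p)
    b = 0.0
    for _ in range(epochs):
        prob = 1.0 / (1.0 + np.exp(-(X @ w + b)))
        grad = prob - y                      # gradient of the log-loss w.r.t. the logit
        w -= lr * (X.T @ grad) / n
        b -= lr * grad.mean()
    return w, b


def select_features(X, y, target_fdr=0.1, drop_frac=0.2, seed=0):
    """Backward elimination with permuted surrogate features (hypothetical sketch).

    Surrogates are known nulls, so the ratio of surviving surrogates to
    surviving real features serves as a rough FDR estimate.
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    # Permute each column independently so it carries no information about y.
    surrogates = np.column_stack([rng.permutation(X[:, j]) for j in range(p)])
    Z = np.hstack([X, surrogates])
    is_real = np.arange(2 * p) < p
    active = np.arange(2 * p)
    while True:
        w, _ = train_logistic(Z[:, active], y)
        n_real = int(is_real[active].sum())
        n_null = int((~is_real[active]).sum())
        est_fdr = n_null / max(n_real, 1)
        if est_fdr <= target_fdr or n_real == 0:
            break
        # Adaptive step (an assumption): drop more features while many
        # surrogates survive, fewer as the estimate approaches the target.
        k = max(1, int(np.ceil(drop_frac * n_null)))
        weakest = np.argsort(np.abs(w))[:k]  # positions of the smallest |weights|
        active = np.delete(active, weakest)
    return np.sort(active[is_real[active]]), est_fdr


# Synthetic example: 20 features, of which only the first three drive the label.
rng = np.random.default_rng(1)
X = rng.standard_normal((200, 20))
y = (X[:, 0] + X[:, 1] - X[:, 2] + 0.5 * rng.standard_normal(200) > 0).astype(float)
selected, est_fdr = select_features(X, y)
print("selected feature indices:", selected)
print("estimated FDR:", est_fdr)
```

In this sketch the importance of a feature is simply the magnitude of its trained weight. A deeper network would need a different importance score, and real scRNA-seq counts would need normalization before any step like this.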

Grants

  1. 12071305/National Natural Science Foundation of China
  2. 11701386/National Natural Science Foundation of China
  3. 2023A1515011399/Natural Science Foundation of Guangdong Province of China

MeSH Terms

Single-Cell Gene Expression Analysis
Neural Networks, Computer
Computer Simulation
Single-Cell Analysis
Sequence Analysis, RNA
Gene Expression Profiling
Cluster Analysis
