Effective feature selection framework for cluster analysis of microarray data.

Gouchol Pok, Jyh-Charn Steve Liu, Keun Ho Ryu
Author Information
  1. Gouchol Pok: Yanbian University of Science and Technology, Dept. of Computer Science, Yanji, Jilin, China 133000.

Abstract

The microarray technique has become a standard means of simultaneously examining the expression of all genes measured under different conditions. Because microarray data typically have high-dimensional features and a small number of samples, feature selection is needed to identify a subset of genes that is meaningful for biological interpretation and accounts for the sample variation. In this article, we present a simple yet effective feature selection framework suited to two-dimensional microarray data. Our correlation-based, nonparametric approach allows a compact representation of class-specific properties with a small number of genes. We evaluated our method on publicly available experimental data and obtained favorable results.
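
The abstract does not spell out the algorithm, so the following is only a generic sketch of what a correlation-based, nonparametric gene ranking can look like, not the authors' framework. It assumes Spearman rank correlation between each gene's expression profile and numeric class labels as the score; the function names (`ranks`, `spearman`, `select_genes`) and the top-k cutoff are hypothetical choices made for illustration.

```python
from typing import List, Sequence


def ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(ranks(x), ranks(y))


def select_genes(expression: Sequence[Sequence[float]],
                 labels: Sequence[float],
                 k: int) -> List[int]:
    """Return indices of the k genes (rows) whose expression is most
    strongly rank-correlated, in absolute value, with the class labels.

    Assumption: rows are genes, columns are samples, labels are numeric.
    """
    scores = [abs(spearman(row, labels)) for row in expression]
    order = sorted(range(len(expression)), key=lambda g: scores[g], reverse=True)
    return order[:k]


# Toy data (made up for illustration): 3 genes x 6 samples, two classes.
toy_expression = [
    [5, 1, 4, 2, 6, 3],  # unrelated to the class split
    [1, 2, 3, 7, 8, 9],  # higher in class 1
    [2, 2, 2, 2, 2, 2],  # constant, carries no information
]
toy_labels = [0, 0, 0, 1, 1, 1]
top_genes = select_genes(toy_expression, toy_labels, k=2)
```

Ranking on ranks rather than raw intensities makes the score insensitive to monotone transformations of the expression values, which is one reason a nonparametric score can suit data with few samples.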
