Microarray data simulator for improved selection of differentially expressed genes.

Sunil Singhal, Chris G Kyvernitis, Steven W Johnson, Larry R Kaiser, Michael N Liebman, Steven M Albelda
Author Information
  1. Sunil Singhal: Section of Thoracic Surgery, Division of Cardiothoracic Surgery, University of Pennsylvania School of Medicine; Philadelphia, Pennsylvania USA.

Abstract

The development of microarray technology has allowed researchers to measure the expression levels of thousands of genes simultaneously. Analysis of these data requires normalization and statistical approaches that account for the biological and technical variability inherent in the technique. To address this problem, we developed a publicly available simulator of microarray hybridization experiments that can be used to help assess the accuracy of bioinformatic tools in discovering significant genes. Estimates of various degrees of technical and biological variability were obtained by analyzing microarray hybridization experiments from over 50 samples. This information was used to develop a simulator of microarray hybridization data that modeled "normal tissue samples" and "diseased tissue samples" with known, defined changes in gene expression (a "gold standard"). The data derived from the simulator were then used to evaluate the true positive and false negative rates of several normalization procedures and gene selection techniques. We found that the type of normalization approach used was an important aspect of data analysis, and that global normalization was the least accurate approach. Evaluation of gene selection techniques showed that "Significance Analysis of Microarrays" (SAM) and "Patterns of Gene Expression" (PaGE) were more accurate than simple t-test analysis. We provide access to the microarray hybridization simulator as a public resource for biologists to test emerging genomic bioinformatic tools.
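
The general idea of testing a gene selection method against simulated data with a known "gold standard" can be illustrated with a minimal Python sketch. This is not the authors' simulator: the function names, the Gaussian noise model, the use of a Welch t statistic with a fixed cutoff, and every parameter value (number of genes, standard deviations, size of the expression shift) are assumptions chosen only for illustration.

```python
import random
import statistics


def simulate(n_genes=200, n_changed=20, n_per_group=8,
             bio_sd=0.3, tech_sd=0.2, shift=1.5, seed=0):
    """Return (normal, diseased, truth) with log2-scale expression values.

    All defaults are illustrative assumptions, not estimates from the paper.
    """
    rng = random.Random(seed)
    truth = set(range(n_changed))  # gold standard: genes with a defined change

    def draw(mean):
        # One value per sample: biological variation plus technical noise.
        return [mean + rng.gauss(0, bio_sd) + rng.gauss(0, tech_sd)
                for _ in range(n_per_group)]

    normal, diseased = [], []
    for gene in range(n_genes):
        base = rng.uniform(6.0, 12.0)  # hypothetical baseline log2 intensity
        delta = shift if gene in truth else 0.0
        normal.append(draw(base))
        diseased.append(draw(base + delta))
    return normal, diseased, truth


def welch_t(a, b):
    """Welch's t statistic comparing sample b against sample a."""
    var_a, var_b = statistics.variance(a), statistics.variance(b)
    std_err = (var_a / len(a) + var_b / len(b)) ** 0.5
    return (statistics.mean(b) - statistics.mean(a)) / std_err


def evaluate(normal, diseased, truth, t_cutoff=4.0):
    """Call genes with |t| above the cutoff and score them against the truth."""
    called = {gene for gene, (a, b) in enumerate(zip(normal, diseased))
              if abs(welch_t(a, b)) > t_cutoff}
    tp = len(called & truth)   # changed genes that were detected
    fn = len(truth - called)   # changed genes that were missed
    fp = len(called - truth)   # unchanged genes called by mistake
    return {"tp": tp, "fn": fn, "fp": fp, "tpr": tp / len(truth)}


if __name__ == "__main__":
    normal, diseased, truth = simulate()
    print(evaluate(normal, diseased, truth))
```

Because the set of changed genes is fixed in advance, any selection method can be swapped in for the t statistic and scored the same way, which is the kind of comparison the abstract describes for SAM, PaGE, and the t-test.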

Grants

  1. P01 CA066726/NCI NIH HHS
  2. R25 CA087812/NCI NIH HHS
  3. R25-CA87812/NCI NIH HHS

MeSH Terms

Computer Simulation
DNA, Neoplasm
Female
Gene Expression Profiling
Gene Expression Regulation, Neoplastic
Humans
Neoplasms
Oligonucleotide Array Sequence Analysis

Chemicals

DNA, Neoplasm
