Sequential Monte Carlo multiple testing.

Geir Kjetil Sandve, Egil Ferkingstad, Ståle Nygård
Author Information
  1. Geir Kjetil Sandve: Department of Informatics, University of Oslo, Oslo, Norway. geirksa@ifi.uio.no

Abstract

MOTIVATION: In molecular biology, as in many other scientific fields, the scale of analyses is ever increasing. Often, complex Monte Carlo simulation is required, sometimes within a large-scale multiple testing setting. The resulting computational costs may be prohibitively high.
RESULTS: We present MCFDR, a simple, novel algorithm for false discovery rate (FDR)-modulated sequential Monte Carlo (MC) multiple hypothesis testing. The algorithm alternates between adding MC samples across tests and calculating intermediate FDR values for the collection of tests. MC sampling is stopped either by the sequential MC stopping rule or when a threshold on the FDR is met. An essential property of the algorithm is that it limits the total number of MC samples regardless of the number of true null hypotheses. We show on both real and simulated data that the proposed algorithm provides large gains in computational efficiency.
AVAILABILITY: MCFDR is implemented in the Genomic HyperBrowser (http://hyperbrowser.uio.no/mcfdr), a web-based system for genome analysis. All input data and results are available and can be reproduced through a Galaxy Pages document at: http://hyperbrowser.uio.no/mcfdr/u/sandve/p/mcfdr.
CONTACT: geirksa@ifi.uio.no.
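
The loop described under RESULTS can be sketched as follows. This is a minimal illustration based only on the abstract, not the authors' implementation in the Genomic HyperBrowser. The function names (bh_adjust, mcfdr_sketch), the parameters (h, fdr_threshold, batch, max_samples), the use of Benjamini-Hochberg adjustment for the intermediate FDR values, the conservative p-value estimate (k + 1) / (n + 1), and the hard cap on samples per test are all assumptions made for the example.

```python
import random


def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def mcfdr_sketch(observed, draw_null, h=20, fdr_threshold=0.1,
                 batch=100, max_samples=10000, rng=None):
    """Alternate between MC sampling and intermediate FDR calculation.

    observed[i] is the observed statistic of test i (larger = more extreme).
    draw_null(i, rng) returns one statistic sampled under the null of test i.
    A test stops sampling once h null samples are at least as extreme as the
    observed value (sequential MC stopping) or, as an assumed safety guard,
    after max_samples draws. All sampling stops once every still-active test
    has an adjusted p-value at or below fdr_threshold.
    """
    rng = rng or random.Random(0)
    m = len(observed)
    n = [0] * m  # MC samples drawn per test
    k = [0] * m  # null samples >= observed, per test
    active = set(range(m))
    pvals, qvals = [], []
    while active:
        # Step 1: add a batch of MC samples to every active test.
        for i in list(active):
            for _ in range(batch):
                n[i] += 1
                if draw_null(i, rng) >= observed[i]:
                    k[i] += 1
                if k[i] >= h or n[i] >= max_samples:
                    active.discard(i)
                    break
        # Step 2: intermediate FDR values for the whole collection of tests.
        pvals = [(k[i] + 1) / (n[i] + 1) for i in range(m)]
        qvals = bh_adjust(pvals)
        # Step 3: stop if all remaining tests are below the FDR threshold.
        if all(qvals[i] <= fdr_threshold for i in active):
            break
    return pvals, qvals, n
```

In this sketch, tests under a true null tend to reach h extreme samples quickly and drop out, while tests with strong signal stop as soon as the collection satisfies the FDR threshold, so neither kind of test keeps sampling indefinitely. For example, calling mcfdr_sketch([5.0, 0.0], lambda i, rng: rng.gauss(0.0, 1.0)) tests two standard-normal statistics, one far in the tail and one at the center.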

MeSH Terms

Algorithms
Computer Simulation
Genome-Wide Association Study
Genomics
Histone Code
Monte Carlo Method
