Beyond the one-way ANOVA for 'omics data.

Kirsty L Hassall, Andrew Mead
Author Information
  1. Kirsty L Hassall: Computational and Analytical Sciences, Rothamsted Research, Harpenden, AL5 2JQ, UK. kirsty.hassall@rothamsted.ac.uk.
  2. Andrew Mead: Computational and Analytical Sciences, Rothamsted Research, Harpenden, AL5 2JQ, UK.

Abstract

BACKGROUND: With ever-increasing access to high-throughput technologies, more complex treatment structures can be assessed in a variety of 'omics applications. This adds an extra layer of complexity to analysis and interpretation, particularly when inferential univariate methods are applied en masse. Mass univariate testing is well known to suffer from multiplicity issues; although these have been well documented for simple comparative tests, few approaches have focussed on more complex explanatory structures.
RESULTS: Two frameworks are introduced that incorporate corrections for multiplicity whilst maintaining appropriate structure in the explanatory variables. Within this paradigm, a choice has to be made as to whether multiplicity corrections should be applied to the saturated model, putting emphasis on controlling the rate of false positives, or to the predictive model, where emphasis is on model selection. This choice has implications for both the ranking and the selection of the response variables identified as differentially expressed. The theoretical difference between the two approaches is demonstrated, along with an empirical study of lipid composition in Arabidopsis under differing levels of salt stress.
CONCLUSIONS: Multiplicity corrections have an inherent weakness when the full explanatory structure is not properly incorporated. Although no unifying 'single best' recommendation is made, two reasonable alternatives are provided and their applicability is discussed for scenarios in which the aims of analysis differ. The key result is that the point at which multiplicity is incorporated into the analysis will fundamentally change the interpretation of the results, and the choice of approach should therefore be driven by the specific aims of the experiment.



Grants

  1. BBS/E/C/000I0210/Biotechnology and Biological Sciences Research Council
  2. BBS/E/C/000I0420/Biotechnology and Biological Sciences Research Council

MeSH Terms

Analysis of Variance
Arabidopsis
Computer Simulation
Genomics
Humans
Lipids
Metabolome
Metabolomics
Sodium Chloride

Chemicals

Lipids
Sodium Chloride
