Statistical tests, P values, confidence intervals, and power: a guide to misinterpretations.

Sander Greenland, Stephen J Senn, Kenneth J Rothman, John B Carlin, Charles Poole, Steven N Goodman, Douglas G Altman
Author Information
  1. Sander Greenland: Department of Epidemiology and Department of Statistics, University of California, Los Angeles, CA, USA. lesdomes@ucla.edu.
  2. Stephen J Senn: Competence Center for Methodology and Statistics, Luxembourg Institute of Health, Strassen, Luxembourg.
  3. Kenneth J Rothman: RTI Health Solutions, Research Triangle Institute, Research Triangle Park, NC, USA.
  4. John B Carlin: Clinical Epidemiology and Biostatistics Unit, Murdoch Children's Research Institute, School of Population Health, University of Melbourne, Melbourne, VIC, Australia.
  5. Charles Poole: Department of Epidemiology, Gillings School of Global Public Health, University of North Carolina, Chapel Hill, NC, USA.
  6. Steven N Goodman: Meta-Research Innovation Center, Departments of Medicine and of Health Research and Policy, Stanford University School of Medicine, Stanford, CA, USA.
  7. Douglas G Altman: Centre for Statistics in Medicine, Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, University of Oxford, Oxford, UK.

Abstract

Misinterpretation and abuse of statistical tests, confidence intervals, and statistical power have been decried for decades, yet remain rampant. A key problem is that there are no interpretations of these concepts that are at once simple, intuitive, correct, and foolproof. Instead, correct use and interpretation of these statistics requires an attention to detail which seems to tax the patience of working scientists. This high cognitive demand has led to an epidemic of shortcut definitions and interpretations that are simply wrong, sometimes disastrously so, and yet these misinterpretations dominate much of the scientific literature. In light of this problem, we provide definitions and a discussion of basic statistics that are more general and critical than typically found in traditional introductory expositions. Our goal is to provide a resource for instructors, researchers, and consumers of statistics whose knowledge of statistical theory and technique may be limited but who wish to avoid and spot misinterpretations. We emphasize how violation of often unstated analysis protocols (such as selecting analyses for presentation based on the P values they produce) can lead to small P values even if the declared test hypothesis is correct, and can lead to large P values even if that hypothesis is incorrect. We then provide an explanatory list of 25 misinterpretations of P values, confidence intervals, and power. We conclude with guidelines for improving statistical interpretation and reporting.
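
The point about selecting analyses by their P values can be illustrated with a small simulation. The sketch below is not taken from the article; it assumes a deliberately simple setup (two groups drawn from the same normal population with known variance, a z-test, and independent analyses), and the function names and parameter values are hypothetical choices made for illustration. It compares how often a study reports P < 0.05 when it runs one pre-specified analysis versus when it runs ten analyses and presents only the smallest P value, with the test hypothesis of no difference true throughout.

```python
import math
import random


def two_sided_p(z):
    """Two-sided P value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))


def one_analysis_p(rng, n=20):
    """P value from a z-test comparing two groups of size n.

    Both groups come from the same N(0, 1) population, so the test
    hypothesis (no difference in means) is correct by construction.
    """
    mean_a = sum(rng.gauss(0, 1) for _ in range(n)) / n
    mean_b = sum(rng.gauss(0, 1) for _ in range(n)) / n
    z = (mean_a - mean_b) / math.sqrt(2 / n)  # known unit variance (assumption)
    return two_sided_p(z)


def fraction_below(alpha, n_studies, n_analyses, seed=0):
    """Share of simulated studies whose reported P value is below alpha.

    Each study runs n_analyses independent analyses and reports only
    the smallest P value, mimicking selection on P values.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_studies):
        reported = min(one_analysis_p(rng) for _ in range(n_analyses))
        if reported < alpha:
            hits += 1
    return hits / n_studies


# One pre-specified analysis per study: close to the nominal 5%.
honest = fraction_below(0.05, n_studies=2000, n_analyses=1)

# Ten analyses per study, smallest P value reported: far above 5%.
selected = fraction_below(0.05, n_studies=2000, n_analyses=10)

print(f"one analysis per study:  {honest:.3f}")
print(f"best of ten per study:   {selected:.3f}")
```

Under these assumptions the expected share for the best-of-ten case is 1 - 0.95^10, roughly 0.40, even though every test hypothesis is correct. Real analyses are rarely independent, so the size of the inflation in practice will differ from this idealized figure.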

Grants

  1. 16895/Cancer Research UK

MeSH Terms

Confidence Intervals
Data Interpretation, Statistical
Humans
Probability
