How to select predictive models for decision-making or causal inference.

Matthieu Doutreligne, Gaël Varoquaux
Author Information
  1. Matthieu Doutreligne: Soda, Inria Saclay, 91120, Palaiseau, France.
  2. Gaël Varoquaux: Soda, Inria Saclay, 91120, Palaiseau, France.

Abstract

BACKGROUND: We investigate which procedure selects the most trustworthy predictive model to explain the effect of an intervention and support decision-making.
METHODS: We study a large variety of model selection procedures in practical settings: finite-sample settings, without the theoretical assumption of well-specified models. Beyond standard cross-validation or internal validation procedures, we also study elaborate causal risks. These build proxies of the causal error, using "nuisance" reweighting to compute it on the observed data. We evaluate whether empirically estimated nuisances, which are necessarily noisy, add noise to model selection, and we compare different metrics for causal model selection in an extensive empirical study based on a simulation and 3 health care datasets based on real covariates.
RESULTS: Among all metrics, the mean squared error, classically used to evaluate predictive models, performs worst. Reweighting it with a propensity score does not bring much improvement in most cases. On average, the $R\text{-risk}$, which uses as nuisances a model of the mean outcome and propensity scores, leads to the best performance. Nuisance corrections are best estimated with flexible estimators such as a super learner.
CONCLUSIONS: When predictive models are used to explain the effect of an intervention, they must be evaluated with procedures different from those used in standard predictive settings, namely with the $R\text{-risk}$ from causal inference.
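
ILLUSTRATION: The abstract does not spell out the $R\text{-risk}$, so the sketch below assumes its usual form in the causal inference literature: the mean of $((y - m(x)) - (a - e(x))\,\tau(x))^2$, where $m$ is the mean outcome model, $e$ the propensity score, and $\tau$ the candidate treatment effect. The function name `r_risk`, the simulated data, and the use of oracle nuisances are all assumptions made for illustration; in practice the nuisances must be estimated from data, for example with a flexible estimator as the results suggest.

```python
import random


def r_risk(y, a, m_hat, e_hat, tau_hat):
    """Mean of ((y - m_hat) - (a - e_hat) * tau_hat) ** 2 over all samples.

    y: observed outcomes, a: binary treatments (0/1),
    m_hat: mean outcome nuisance, e_hat: propensity score nuisance,
    tau_hat: candidate treatment effect for each sample.
    """
    squared_residuals = [
        ((yi - mi) - (ai - ei) * ti) ** 2
        for yi, ai, mi, ei, ti in zip(y, a, m_hat, e_hat, tau_hat)
    ]
    return sum(squared_residuals) / len(squared_residuals)


# Toy simulation (hypothetical): randomized treatment, constant true effect.
rng = random.Random(0)
n = 2000
true_effect = 2.0
x = [rng.uniform(-1, 1) for _ in range(n)]
a = [1 if rng.random() < 0.5 else 0 for _ in range(n)]
y = [xi + true_effect * ai + rng.gauss(0, 0.1) for xi, ai in zip(x, a)]

# Oracle nuisances, known only because we wrote the simulation ourselves.
e_hat = [0.5] * n                             # P(A = 1 | X)
m_hat = [xi + 0.5 * true_effect for xi in x]  # E[Y | X]

# Score two candidate effect models: the true effect and a null effect.
risk_good = r_risk(y, a, m_hat, e_hat, [true_effect] * n)
risk_bad = r_risk(y, a, m_hat, e_hat, [0.0] * n)
print(risk_good, risk_bad)
```

In this toy setting the candidate that matches the true effect obtains the lower risk, which is the behavior a model selection criterion for causal questions should have. It says nothing about how the criterion behaves with noisy, estimated nuisances, which is the question the study addresses.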

MeSH Term

Humans
Decision Making
Models, Statistical
Causality
Propensity Score
Computer Simulation
