Applying machine learning to predict real-world individual treatment effects: insights from a virtual patient cohort.

Gang Fang, Izabela E Annis, Jennifer Elston-Lafata, Samuel Cykert
Author Information
  1. Gang Fang: Division of Pharmaceutical Outcomes and Policy, UNC Eshelman School of Pharmacy, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA.
  2. Izabela E Annis: Division of Pharmaceutical Outcomes and Policy, UNC Eshelman School of Pharmacy, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA.
  3. Jennifer Elston-Lafata: Division of Pharmaceutical Outcomes and Policy, UNC Eshelman School of Pharmacy, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA.
  4. Samuel Cykert: Program for Health and Clinical Informatics, School of Medicine, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA.

Abstract

OBJECTIVE: We aimed to investigate bias in applying machine learning to predict real-world individual treatment effects.
MATERIALS AND METHODS: Using a virtual patient cohort, we simulated real-world healthcare data and applied random forest and gradient boosting classifiers to develop prediction models. The treatment effect was estimated as the difference between the predicted outcomes under treatment and under control. We evaluated how predictors with known effects on treatment and outcome using real-world data (ie, treatment predictors [X1], confounders [X2], treatment effect modifiers [X3], and other outcome risk factors [X4]) and outcome imbalance affected the prediction of individual outcomes. Using counterfactuals, we evaluated the percentage of patients with biased predicted individual treatment effects.
RESULTS: X4 had relatively more impact on model performance than X2 and X3 did. No effects were observed from X1. Moderate-to-severe outcome imbalance had a significantly negative impact on model performance, particularly among subgroups in which an outcome occurred. Bias in predicting individual treatment effects was significant and persisted even when the models had 100% accuracy in predicting the health outcome.
DISCUSSION: Inadequate inclusion of X2, X3, and X4 and moderate-to-severe outcome imbalance may impair model performance in predicting individual outcomes and subsequently bias the predicted individual treatment effects. Machine learning models with all features and high performance for predicting individual outcomes still yielded biased individual treatment effects.
CONCLUSIONS: Direct application of machine learning might not adequately address bias in predicting individual treatment effects. Further method development is needed to advance machine learning to support individualized treatment selection.
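
Illustrative sketch (not the authors' code): the estimation step described in the methods, in which an individual treatment effect is taken as the predicted outcome under treatment minus the predicted outcome under control and then checked against known counterfactuals, can be sketched roughly as follows. Everything specific here is an assumption made for illustration: the two binary features standing in for X2 and X3, the risk values, the cohort size, and the definition of a "biased" prediction as one that differs from the true counterfactual difference. A simple stratified frequency table is swapped in for the random forest and gradient boosting classifiers so that the example needs only the Python standard library.

```python
import random
from collections import defaultdict


def simulate_cohort(n, seed=0):
    """Simulate virtual patients whose outcomes under BOTH arms are known."""
    rng = random.Random(seed)
    cohort = []
    for _ in range(n):
        x2 = int(rng.random() < 0.5)  # confounder (hypothetical binary feature)
        x3 = int(rng.random() < 0.5)  # treatment effect modifier (hypothetical)
        t = int(rng.random() < (0.7 if x2 else 0.3))  # confounded assignment
        p0 = 0.2 + 0.4 * x2  # assumed outcome risk under control
        p1 = p0 - 0.2 * x3   # assumed outcome risk under treatment
        u = rng.random()     # one shared draw links the two potential outcomes
        y0, y1 = int(u < p0), int(u < p1)
        cohort.append({"x2": x2, "x3": x3, "t": t, "y0": y0, "y1": y1,
                       "y": y1 if t else y0})  # only y would be observed
    return cohort


def fit_outcome_model(cohort):
    """Observed outcome frequency per (t, x2, x3) stratum."""
    counts = defaultdict(lambda: [0, 0])  # stratum -> [events, patients]
    for p in cohort:
        cell = counts[(p["t"], p["x2"], p["x3"])]
        cell[0] += p["y"]
        cell[1] += 1
    return {k: events / total for k, (events, total) in counts.items()}


def predict_class(model, t, x2, x3):
    """Predicted outcome class at a 0.5 probability threshold."""
    return int(model[(t, x2, x3)] >= 0.5)


def evaluate(cohort, model):
    """Return (outcome accuracy, share of patients with a mismatched ITE)."""
    correct = wrong_ite = 0
    for p in cohort:
        pred_treated = predict_class(model, 1, p["x2"], p["x3"])
        pred_control = predict_class(model, 0, p["x2"], p["x3"])
        pred_observed = pred_treated if p["t"] else pred_control
        correct += pred_observed == p["y"]
        # Predicted ITE: predicted outcome under treatment minus under control,
        # compared with the true counterfactual difference y1 - y0.
        wrong_ite += (pred_treated - pred_control) != (p["y1"] - p["y0"])
    return correct / len(cohort), wrong_ite / len(cohort)


cohort = simulate_cohort(20000)
model = fit_outcome_model(cohort)
accuracy, biased_share = evaluate(cohort, model)
print(f"outcome accuracy: {accuracy:.3f}")
print(f"patients with a mismatched predicted ITE: {biased_share:.3f}")
```

In this toy setup the outcome model sees every feature that drives the outcome, yet the difference of two predicted classes can still disagree with a patient's true counterfactual difference, which is the kind of discrepancy the abstract reports. The numbers it prints depend entirely on the assumed risks and should not be read as reproducing the study's results.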

Grants

  1. R01 AG046267/NIA NIH HHS
  2. R21 AG043668/NIA NIH HHS

MeSH Terms

Cohort Studies
Computer Simulation
Humans
Machine Learning
Outcome and Process Assessment, Health Care
Precision Medicine
Prognosis
Treatment Outcome
