Causal AI with Real World Data: Do Statins Protect from Alzheimer's Disease Onset?

Mattia Prosperi, Marco Salemi, Shantanu Ghosh, Tianchen Lyu, Jiang Bian, Zhaoyi Chen, Jinying Zhao
Author Information
  1. Mattia Prosperi: Department of Epidemiology, University of Florida.
  2. Marco Salemi: Department of Pathology, University of Florida.
  3. Shantanu Ghosh: Department of Epidemiology, University of Florida.
  4. Tianchen Lyu: Department of Health Outcomes and Biomedical Informatics, University of Florida.
  5. Jiang Bian: Department of Health Outcomes and Biomedical Informatics, University of Florida.
  6. Zhaoyi Chen: Department of Health Outcomes and Biomedical Informatics, University of Florida.
  7. Jinying Zhao: Department of Epidemiology, University of Florida.

Abstract

Causal artificial intelligence aims to develop bias-robust models that can be used to intervene on risks or outcomes, rather than merely predict them. However, learning interventional models from observational data, including electronic health records (EHR), is challenging because of inherent biases, e.g., protopathic, confounding, and collider bias. When estimating the effects of treatment interventions, classical approaches such as propensity score matching are often used, but they have limitations with large feature sets, nonlinear/nonparallel treatment group assignments, and collider bias. In this work, we used data from a large EHR consortium (OneFlorida) and evaluated causal statistical/machine learning methods for determining the effect of statin treatment on the risk of Alzheimer's disease, a debated clinical research question. We introduced a combination of directed acyclic graph (DAG) learning and comparison with an expert's design, with calculation of the generalized adjustment criterion (GAC), to find an optimal set of covariates for estimating treatment effects while ameliorating collider bias. The DAG/GAC approach was assessed together with traditional propensity score matching, inverse probability weighting, virtual-twin/counterfactual random forests, and deep counterfactual networks. We found large heterogeneity in effect estimates across model configurations. Our results did not exclude a protective effect of statins: the DAG/GAC point estimate aligned with the maximum credibility estimate, although the 95% credibility interval included a null effect, warranting further studies and replication.
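
To make one of the methods named in the abstract concrete, the following is a minimal sketch of inverse probability weighting on synthetic data. It is not the authors' implementation and does not use OneFlorida data. The single binary confounder `x`, the treatment and outcome probabilities, and all function names are hypothetical and chosen only for illustration. The simulated treatment lowers outcome risk by 0.05, yet the unadjusted comparison suggests harm because treated individuals are more often high-risk.

```python
import random


def simulate(n, seed=0):
    """Simulate (x, t, y) rows. All probabilities are assumed, not from the paper."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = 1 if rng.random() < 0.5 else 0                    # hypothetical confounder
        t = 1 if rng.random() < (0.7 if x else 0.3) else 0    # treatment depends on x
        p_y = 0.10 + 0.20 * x - 0.05 * t                      # true risk difference: -0.05
        y = 1 if rng.random() < p_y else 0
        rows.append((x, t, y))
    return rows


def naive_risk_difference(rows):
    """Unadjusted difference in outcome risk, treated minus untreated."""
    treated = [y for _, t, y in rows if t == 1]
    control = [y for _, t, y in rows if t == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)


def ipw_risk_difference(rows):
    """Normalized (Hajek) inverse-probability-weighted risk difference."""
    # Estimate the propensity score P(t = 1 | x) within each stratum of x.
    ps = {}
    for level in (0, 1):
        stratum = [t for x, t, _ in rows if x == level]
        ps[level] = sum(stratum) / len(stratum)

    num1 = den1 = num0 = den0 = 0.0
    for x, t, y in rows:
        if t == 1:
            w = 1.0 / ps[x]            # weight for treated rows
            num1 += w * y
            den1 += w
        else:
            w = 1.0 / (1.0 - ps[x])    # weight for untreated rows
            num0 += w * y
            den0 += w
    return num1 / den1 - num0 / den0


rows = simulate(200_000)
naive = naive_risk_difference(rows)
ipw = ipw_risk_difference(rows)
print(f"naive risk difference: {naive:+.3f}")
print(f"IPW risk difference:   {ipw:+.3f}")
```

With one measured binary confounder, weighting by the stratum-specific propensity is enough to recover an estimate close to the simulated effect. With real EHR data, the choice of adjustment set is the hard part, which is the problem the DAG/GAC step in the abstract addresses.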

Grants

  1. U18 DP006512/NCCDPHP CDC HHS
  2. R21 CA245858/NCI NIH HHS
  3. UL1 TR001427/NCATS NIH HHS
  4. R01 CA246418/NCI NIH HHS
  5. U18 DP006512/ACL HHS
  6. R21 AG068717/NIA NIH HHS
