Targeted maximum likelihood estimation for causal inference in survival and competing risks analysis.

Helene C W Rytgaard, Mark J van der Laan
Author Information
  1. Helene C W Rytgaard: Section of Biostatistics, University of Copenhagen, Copenhagen, Denmark. hely@sund.ku.dk.
  2. Mark J van der Laan: Division of Biostatistics, Center for Targeted Machine, Berkeley, CA, USA.

Abstract

Targeted maximum likelihood estimation (TMLE) provides a general methodology for estimating causal parameters in the presence of high-dimensional nuisance parameters. TMLE is a two-step procedure: data-adaptive estimation of the nuisance parameters is followed by a targeted update step, which yields semiparametric efficiency and rigorous statistical inference. In this paper, we demonstrate the practical applicability of TMLE-based causal inference in survival and competing risks settings where event times are not confined to a discrete and finite grid. We focus on estimation of causal effects of time-fixed treatment decisions on survival and absolute risk probabilities, considering different univariate and multidimensional parameters. Besides providing general guidance on using TMLE for survival and competing risks analysis, we describe how previous work can be extended by estimating the conditional hazards with loss-based cross-validated estimation, also known as super learning. We illustrate the considered methods using publicly available data from a trial on adjuvant chemotherapy for colon cancer. R software code to implement all considered algorithms and to reproduce all analyses is available in an accompanying online appendix on GitHub.
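
To make the target quantities concrete, the following minimal sketch shows how an absolute risk probability can be computed from cause-specific hazards, the nuisance parameters named in the abstract. It is not the authors' R code from the online appendix, and it leaves out the targeted update step of TMLE entirely. It assumes two competing causes with known hazard functions and integrates numerically over continuous time. The function names, the constant hazard rates, and the time horizon are all hypothetical values chosen so that a closed form is available for comparison.

```python
import math


def absolute_risk(hazard_1, hazard_2, horizon, n_steps=100_000):
    """Approximate the cause-1 absolute risk at `horizon`.

    Uses the midpoint rule for the integral of S(s) * hazard_1(s) ds over
    [0, horizon], where S is the event-free survival function implied by
    the sum of the two cause-specific hazards.
    """
    dt = horizon / n_steps
    cumulative_hazard = 0.0  # all-cause cumulative hazard at the step start
    risk = 0.0
    for k in range(n_steps):
        s = (k + 0.5) * dt  # midpoint of the current step
        h1 = hazard_1(s)
        h2 = hazard_2(s)
        # Event-free survival at the midpoint of the step.
        survival_mid = math.exp(-(cumulative_hazard + 0.5 * (h1 + h2) * dt))
        risk += survival_mid * h1 * dt
        cumulative_hazard += (h1 + h2) * dt
    return risk


def closed_form_risk(rate_1, rate_2, horizon):
    """Cause-1 absolute risk when both hazards are constant."""
    total = rate_1 + rate_2
    return rate_1 / total * (1.0 - math.exp(-total * horizon))


# Hypothetical constant hazards in two treatment arms (illustrative only).
risk_treated = absolute_risk(lambda s: 0.10, lambda s: 0.05, horizon=5.0)
risk_control = absolute_risk(lambda s: 0.20, lambda s: 0.05, horizon=5.0)
risk_difference = risk_treated - risk_control
print(f"risk difference at the horizon: {risk_difference:.4f}")
```

In an actual analysis the hazards would be estimated from data, for example by super learning, and the resulting plug-in estimate would then be updated in the targeting step. The sketch covers only the mapping from hazards to absolute risks.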

MeSH Term

Humans
Likelihood Functions
Algorithms
Software
Causality
Survival Analysis
