A review of survival stacking: a method to cast survival regression analysis as a classification problem.

Erin Craig, Chenyang Zhong, Robert Tibshirani
Author Information
  1. Erin Craig: Department of Biomedical Data Science, Stanford University, Stanford, USA.
  2. Chenyang Zhong: Department of Statistics, Columbia University, New York, USA.
  3. Robert Tibshirani: Departments of Biomedical Data Science and Statistics, Stanford University, Stanford, USA.

Abstract

While there are many well-developed data science methods for classification and regression, there are relatively few methods for working with right-censored data. Here, we review survival stacking, a method for casting a survival regression analysis problem as a classification problem, thereby allowing the use of general classification methods and software in a survival setting. Inspired by the Cox partial likelihood, survival stacking collects features and outcomes of survival data in a large data frame with a binary outcome. We show that survival stacking with logistic regression is approximately equivalent to the Cox proportional hazards model. We further illustrate survival stacking on real and simulated data. By reframing survival regression problems as classification problems, survival stacking removes the reliance on specialized tools for survival regression, and makes it straightforward for data scientists to use well-known learning algorithms and software for classification in the survival setting. This in turn lowers the barrier for flexible survival modeling.
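The stacking step described in the abstract can be illustrated with a minimal sketch. It assumes the usual risk-set construction suggested by the Cox partial likelihood: for each distinct event time, every subject still at risk contributes one row holding its features, a one-hot indicator of the risk set, and a binary label that is 1 only for a subject whose event occurs at that time. The function name `survival_stack`, the one-hot encoding of risk sets, and the toy data are illustrative assumptions, not taken from the paper.

```python
def survival_stack(X, time, event):
    """Stack right-censored data into a binary classification data set.

    X     : list of feature lists, one per subject
    time  : observed time per subject (event or censoring time)
    event : 1 if the event was observed, 0 if the subject was censored

    Returns (stacked_X, stacked_y). Each stacked row is the subject's
    features followed by a one-hot indicator of the risk set it belongs to.
    This layout is an assumption made for illustration.
    """
    # Distinct times at which at least one event was observed, in order.
    event_times = sorted({t for t, d in zip(time, event) if d})

    stacked_X, stacked_y = [], []
    for k, t_k in enumerate(event_times):
        onehot = [1.0 if j == k else 0.0 for j in range(len(event_times))]
        for x_i, t_i, d_i in zip(X, time, event):
            if t_i >= t_k:  # subject i is still at risk at time t_k
                stacked_X.append(list(x_i) + onehot)
                # Label is 1 only if subject i has its event exactly at t_k.
                stacked_y.append(1 if (d_i and t_i == t_k) else 0)
    return stacked_X, stacked_y


# Toy data: four subjects, one feature; the second subject is censored at t=2.
X = [[0.5], [1.0], [1.5], [2.0]]
time = [1.0, 2.0, 3.0, 4.0]
event = [1, 0, 1, 1]

X_stacked, y_stacked = survival_stack(X, time, event)
print(len(X_stacked), y_stacked)
```

The stacked rows and labels can then be handed to any binary classifier. With logistic regression, the risk-set indicator columns act as time-specific intercepts, which is the setting in which the abstract describes the approximate equivalence with the Cox proportional hazards model.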
