Scalable Algorithms for Large Competing Risks Data.

Eric S Kawaguchi, Jenny I Shen, Marc A Suchard, Gang Li
Author Information
  1. Eric S Kawaguchi: Department of Preventive Medicine, University of Southern California.
  2. Jenny I Shen: Division of Nephrology and Hypertension, Los Angeles Biomedical Institute at Harbor-UCLA Medical Center.
  3. Marc A Suchard: Department of Biostatistics, University of California, Los Angeles.
  4. Gang Li: Department of Biostatistics, University of California, Los Angeles.

Abstract

This paper develops two orthogonal contributions to scalable sparse regression for competing risks time-to-event data. First, we study and accelerate the broken adaptive ridge method (BAR), a surrogate ℓ0-based iteratively reweighted ℓ2-penalization algorithm that achieves sparsity in its limit, in the context of the Fine-Gray (1999) proportional subdistributional hazards (PSH) model. In particular, we derive a new algorithm for BAR regression, named cycBAR, that performs a cyclic update of each coordinate using an explicit thresholding formula. The new cycBAR algorithm avoids fitting multiple reweighted ℓ2-penalizations and thus yields impressive speedups over the original BAR algorithm. Second, we address a pivotal computational issue related to fitting the PSH model. Specifically, the computation costs of the log-pseudo likelihood and its derivatives for the PSH model grow at the rate of O(n^2) with the sample size n in current implementations. We propose a novel forward-backward scan algorithm that reduces the computation costs to O(n). The proposed method applies to both unpenalized and penalized estimation for the PSH model and has exhibited drastic speedups over current implementations. Finally, combining the two algorithms can yield >1,000-fold speedups over the original BAR algorithm. Illustrations of the impressive scalability of our proposed algorithm for large competing risks data are given using both simulations and United States Renal Data System data. Supplementary materials for this article are available online.
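
The gap between quadratic and linear cost can be illustrated with a minimal Python sketch. It is not the forward-backward scan of the paper: it covers only the simpler case of ordinary risk-set sums (as in a Cox-type likelihood), whereas the PSH model also keeps subjects with competing events in the risk set under time-dependent weights. The function names `risk_sums_naive` and `risk_sums_scan` are hypothetical, and distinct observation times are assumed.

```python
import math

def risk_sums_naive(times, eta):
    """For each subject i, sum exp(eta[j]) over all j with times[j] >= times[i].

    Direct double loop: O(n^2) operations.
    """
    w = [math.exp(e) for e in eta]
    n = len(times)
    return [sum(w[j] for j in range(n) if times[j] >= times[i]) for i in range(n)]

def risk_sums_scan(times, eta):
    """Same quantity via a single running sum over subjects sorted by time.

    Assumption for this sketch: all times are distinct. After the sort,
    the scan itself touches each subject once, so it is O(n).
    """
    w = [math.exp(e) for e in eta]
    # Visit subjects from the latest observed time to the earliest.
    order = sorted(range(len(times)), key=lambda i: times[i], reverse=True)
    out = [0.0] * len(times)
    running = 0.0
    for i in order:
        running += w[i]   # subject i joins the accumulated risk set
        out[i] = running  # everyone with time >= times[i] has been added
    return out
```

Both functions return the same values on data with distinct times; the scan replaces the inner loop with one accumulated total, which is the general idea behind reducing per-evaluation cost from O(n^2) to O(n).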

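Keywords

Broken adaptive ridge; Massive sample size; Model selection/variable selection; Oracle property; Subdistribution hazard; ℓ0-regularization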

Grants

  1. U19 AI135995/NIAID NIH HHS
  2. K23 DK103972/NIDDK NIH HHS
  3. UL1 TR000124/NCATS NIH HHS
  4. P30 CA016042/NCI NIH HHS
  5. P50 CA211015/NCI NIH HHS
  6. UL1 TR001881/NCATS NIH HHS
