Incorporating longitudinal biomarkers for dynamic risk prediction in the era of big data: A pseudo-observation approach.

Lili Zhao, Susan Murray, Laura H Mariani, Wenjun Ju
Author Information
  1. Lili Zhao: Department of Biostatistics, University of Michigan, Ann Arbor, Michigan, USA.
  2. Susan Murray: Department of Biostatistics, University of Michigan, Ann Arbor, Michigan, USA.
  3. Laura H Mariani: Department of Internal Medicine/Nephrology, University of Michigan, Ann Arbor, Michigan, USA.
  4. Wenjun Ju: Division of Nephrology, University of Michigan, Ann Arbor, Michigan, USA.

Abstract

Longitudinal biomarker data are often collected in studies, providing important information about the probability that an outcome of interest will occur at a future time. With many new and evolving technologies for biomarker discovery, the number of biomarker measurements available for analysis of disease progression has increased dramatically. A large amount of data provides a more complete picture of a patient's disease progression, potentially allowing us to make more accurate and reliable predictions, but the magnitude of available data introduces challenges for most statistical analysts. Existing approaches suffer immensely from the curse of dimensionality. In this article, we propose methods for making dynamic risk predictions using repeatedly measured biomarkers of a large dimension, including cases when the number of biomarkers is close to the sample size. The proposed methods are computationally simple, yet sufficiently flexible to capture complex relationships between longitudinal biomarkers and potentially censored event times. The proposed approaches are evaluated by extensive simulation studies and are further illustrated by an application to a data set from the Nephrotic Syndrome Study Network.
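
The abstract does not spell out how pseudo-observations are built, so the following is only a minimal background sketch of the standard jackknife construction for a survival probability at a fixed time t, not the authors' implementation. It assumes right-censored data and a Kaplan-Meier estimator; the function names `km_survival` and `pseudo_observations` are hypothetical and chosen for illustration.

```python
import numpy as np


def km_survival(times, events, t):
    """Kaplan-Meier estimate of S(t) = P(T > t) (illustrative helper)."""
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=bool)
    surv = 1.0
    # Multiply the conditional survival factors over distinct event times <= t.
    for u in np.unique(times[events & (times <= t)]):
        at_risk = np.sum(times >= u)               # subjects still under observation at u
        failures = np.sum((times == u) & events)   # events occurring exactly at u
        surv *= 1.0 - failures / at_risk
    return surv


def pseudo_observations(times, events, t):
    """Jackknife pseudo-observations for S(t): n*S(t) - (n-1)*S_{-i}(t)."""
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=bool)
    n = len(times)
    full = km_survival(times, events, t)
    keep = np.ones(n, dtype=bool)
    pseudo = np.empty(n)
    for i in range(n):
        keep[i] = False                            # leave subject i out
        loo = km_survival(times[keep], events[keep], t)
        pseudo[i] = n * full - (n - 1) * loo
        keep[i] = True                             # restore before the next subject
    return pseudo
```

In the general pseudo-observation literature, values like these stand in for the partly unobserved indicator I(T > t), so they can be used as an uncensored response in a regression or machine-learning model. Without censoring, each pseudo-observation reduces to that indicator exactly.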

Grants

  1. P30 CA046592/NCI NIH HHS
  2. U54 DK083912/NIDDK NIH HHS
  3. UL1 TR002240/NCATS NIH HHS

MeSH Terms

Big Data
Biomarkers
Computer Simulation
Disease Progression
Humans
Longitudinal Studies
Probability
Sample Size

Chemicals

Biomarkers
