Estimating treatment effects under untestable assumptions with nonignorable missing data.

Manuel Gomes, Michael G Kenward, Richard Grieve, James Carpenter
Author Information
  1. Manuel Gomes: Department of Applied Health Research, University College London, London, UK.
  2. Michael G Kenward: Department of Medical Statistics, LSHTM, London, UK.
  3. Richard Grieve: Department of Health Services Research and Policy, LSHTM, London, UK.
  4. James Carpenter: Department of Medical Statistics, LSHTM, London, UK.

Abstract

Nonignorable missing data pose key challenges for estimating treatment effects because the substantive model may not be identifiable without imposing further assumptions. For example, the Heckman selection model has been widely used for handling nonignorable missing data, but it requires the analyst to make correct assumptions both about the joint distribution of the missingness and the outcome and about the validity of the exclusion restriction (a variable that predicts missingness but has no direct effect on the outcome). Recent studies have revisited how alternative selection model approaches, for example those estimated by multiple imputation (MI) and maximum likelihood, relate to Heckman-type approaches in addressing the first hurdle. However, the extent to which these different selection models rely on the exclusion restriction assumption with nonignorable missing data is unclear. Motivated by an interventional study (REFLUX) with nonignorable missing outcome data in half of the sample, this article critically examines the role of the exclusion restriction in Heckman, MI, and full-likelihood selection models when addressing nonignorability. We explore the implications of the different methodological choices concerning the exclusion restriction for relative bias and root-mean-squared error in estimating treatment effects. We find that the relative performance of the methods differs in practically important ways according to the relevance and strength of the exclusion restriction. The full-likelihood approach is less sensitive to alternative assumptions about the exclusion restriction than Heckman-type models and appears to be an appropriate method for handling nonignorable missing data. We illustrate the implications of method choice for inference in the REFLUX study, which evaluates the effect of laparoscopic surgery on long-term quality of life for patients with gastro-oesophageal reflux disease.
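To make the role of the exclusion restriction concrete, the following is a minimal simulation sketch and not the authors' implementation. It assumes a hypothetical two-arm trial with bivariate normal errors, a binary variable z that affects missingness only (the exclusion restriction), and illustrative parameter values chosen here (true effect 1.0, error correlation 0.8). It compares a complete-case estimate with a simplified two-step Heckman-type estimate whose first step is a saturated probit over the four (treatment, z) cells.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: pdf, cdf, inv_cdf


def simulate(n=40000, effect=1.0, rho=0.8, seed=1):
    """Simulate a two-arm trial whose outcome is missing not at random.

    All parameter values are illustrative assumptions, not taken from REFLUX.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        treat = rng.randint(0, 1)
        z = rng.randint(0, 1)  # hypothetical exclusion-restriction variable
        u = rng.gauss(0, 1)    # error in the missingness (selection) model
        # Outcome error correlated with u: this is what makes missingness nonignorable.
        e = rho * u + (1 - rho ** 2) ** 0.5 * rng.gauss(0, 1)
        observed = (-0.5 + 1.0 * treat + 1.0 * z + u) > 0
        y = 2.0 + effect * treat + e  # z has no direct effect on y
        rows.append((treat, z, observed, y))
    return rows


def ols(X, y):
    """Least squares via the normal equations, solved by Gauss-Jordan elimination."""
    k = len(X[0])
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(k)]
    for i in range(k):
        pivot = A[i][i]
        A[i] = [v / pivot for v in A[i]]
        for j in range(k):
            if j != i:
                factor = A[j][i]
                A[j] = [vj - factor * vi for vj, vi in zip(A[j], A[i])]
    return [A[i][k] for i in range(k)]


def complete_case_effect(rows):
    """Difference in mean observed outcome between arms, ignoring missingness."""
    means = {}
    for arm in (0, 1):
        ys = [y for t, _, obs, y in rows if obs and t == arm]
        means[arm] = sum(ys) / len(ys)
    return means[1] - means[0]


def heckman_two_step_effect(rows):
    """Simplified two-step Heckman-type estimate of the treatment effect."""
    # Step 1: saturated probit, i.e. one response probability per (treat, z) cell.
    inverse_mills = {}
    for t in (0, 1):
        for z in (0, 1):
            cell = [obs for tt, zz, obs, _ in rows if tt == t and zz == z]
            p = sum(cell) / len(cell)
            index = STD_NORMAL.inv_cdf(p)
            inverse_mills[(t, z)] = STD_NORMAL.pdf(index) / p
    # Step 2: regress observed outcomes on treatment and the inverse Mills ratio.
    X = [[1.0, float(t), inverse_mills[(t, z)]] for t, z, obs, _ in rows if obs]
    y = [yy for _, _, obs, yy in rows if obs]
    return ols(X, y)[1]


if __name__ == "__main__":
    data = simulate()
    print("complete-case effect:", round(complete_case_effect(data), 3))
    print("two-step Heckman-type effect:", round(heckman_two_step_effect(data), 3))
```

In this sketch the correction term is identified only because z shifts the response probability within each arm. Without z there would be one inverse Mills ratio value per arm, which is perfectly collinear with the treatment indicator, so the second-step regression could not separate the treatment effect from the selection correction.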

Grants

  1. MC_UU_12023/21/Medical Research Council

MeSH Terms

Bias
Gastroesophageal Reflux
Humans
Likelihood Functions
Models, Statistical
Quality of Life
