What difference does multiple imputation make in longitudinal modeling of EQ-5D-5L data? Empirical analyses of simulated and observed missing data patterns.
Inka Rösel, Lina María Serna-Higuita, Fatima Al Sayah, Maresa Buchholz, Ines Buchholz, Thomas Kohlmann, Peter Martus, You-Shan Feng
Author Information
Inka Rösel: Institute for Clinical Epidemiology and Applied Biostatistics, Medical University of Tübingen, Silcherstraße 5, 72076, Tübingen, Germany.
Lina María Serna-Higuita: Institute for Clinical Epidemiology and Applied Biostatistics, Medical University of Tübingen, Silcherstraße 5, 72076, Tübingen, Germany. Lina.serna-higuita@med.uni-tuebingen.de.
Fatima Al Sayah: Alberta PROMs and EQ-5D Research and Support Unit (APERSU), School of Public Health, University of Alberta, Alberta, Canada.
Maresa Buchholz: Institute for Nursing Science and Interprofessional Education, Medical University Greifswald, Greifswald, Germany.
Ines Buchholz: Institute for Community Medicine, Medical University Greifswald, Greifswald, Germany.
Thomas Kohlmann: Institute for Community Medicine, Medical University Greifswald, Greifswald, Germany.
Peter Martus: Institute for Clinical Epidemiology and Applied Biostatistics, Medical University of Tübingen, Silcherstraße 5, 72076, Tübingen, Germany.
You-Shan Feng: Institute for Clinical Epidemiology and Applied Biostatistics, Medical University of Tübingen, Silcherstraße 5, 72076, Tübingen, Germany.
PURPOSE: Although multiple imputation is the state-of-the-art method for managing missing data, mixed models without multiple imputation may be equally valid for longitudinal data. Additionally, it is not clear whether missing values in multi-item instruments should be imputed at the item level or the score level. We therefore explored the differences in analyzing the scores of a health-related quality of life questionnaire (EQ-5D-5L) using four approaches in two empirical datasets.
METHODS: We used simulated missingness patterns (GR dataset) and observed missingness patterns (ABCD dataset) in EQ-5D-5L scores to investigate the following approaches: (approach 1) mixed models restricted to respondents with complete data, (approach 2) mixed models using all available data, (approach 3) mixed models after multiple imputation of the EQ-5D-5L scores, and (approach 4) mixed models after multiple imputation of the EQ-5D-5L items.
RESULTS: Approach 1 yielded the highest estimates of all approaches (ABCD, GR), increasingly overestimating the EQ-5D-5L score with higher percentages of missing data (GR). Approach 4 produced the lowest scores at follow-up evaluations (ABCD, GR). Standard errors (0.006-0.008) and mean squared errors (0.032-0.035) increased with increasing percentages of simulated missing GR data. Approaches 2 and 3 showed similar results in both datasets.
CONCLUSION: Complete-case analyses overestimated the scores, and mixed models after multiple imputation of items yielded the lowest scores. As there was no loss of accuracy, mixed models without multiple imputation might be the most parsimonious choice for dealing with missing data when baseline covariates are complete. However, multiple imputation may be needed when baseline covariates are missing and/or more than two timepoints are considered.
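Approaches 3 and 4 fit the mixed model once per imputed dataset, so the per-dataset estimates have to be combined into a single estimate and standard error. The abstract does not say how this pooling was done. The sketch below assumes the conventional Rubin's rules, in which the pooled estimate is the mean of the per-imputation estimates and the total variance is the within-imputation variance plus (1 + 1/m) times the between-imputation variance. The function name `pool_rubin` and the example numbers are hypothetical and are not taken from the study.

```python
import math
import statistics


def pool_rubin(estimates, variances):
    """Pool m per-imputation results with Rubin's rules (assumed pooling method).

    estimates: point estimates, one per imputed dataset
    variances: squared standard errors, one per imputed dataset
    Returns (pooled_estimate, pooled_standard_error).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations and matching lengths")

    pooled = statistics.fmean(estimates)        # mean of the m estimates
    within = statistics.fmean(variances)        # average within-imputation variance
    between = statistics.variance(estimates)    # sample variance across imputations
    total = within + (1 + 1 / m) * between      # total variance under Rubin's rules
    return pooled, math.sqrt(total)


# Hypothetical follow-up score estimates from three imputed datasets.
est, se = pool_rubin([0.80, 0.82, 0.81], [0.00004, 0.00005, 0.00006])
print(f"pooled estimate = {est:.3f}, pooled SE = {se:.4f}")
```

The between-imputation term is what lets the pooled standard error reflect uncertainty due to the missing values; analysing a single imputed dataset would omit it.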