Multiple imputation using chained equations: Issues and guidance for practice.

Ian R White, Patrick Royston, Angela M Wood
Author Information
  1. Ian R White: MRC Biostatistics Unit, Institute of Public Health, Robinson Way, Cambridge CB2 0SR, U.K. ian.white@mrc-bsu.cam.ac.uk.

Abstract

Multiple imputation by chained equations is a flexible and practical approach to handling missing data. We describe the principles of the method and show how to impute categorical and quantitative variables, including skewed variables. We give guidance on how to specify the imputation model and how many imputations are needed. We describe the practical analysis of multiply imputed data, including model building and model checking. We stress the limitations of the method and discuss the possible pitfalls. We illustrate the ideas using a data set in mental health, giving Stata code fragments.

Grants

  1. MC_U105260558/Medical Research Council
  2. U.1228.06.01.00002.01/Medical Research Council
  3. U.1052.00.006/Medical Research Council

MeSH Terms

Adolescent
Adult
Aged
Cardiovascular Diseases
Cholesterol
Female
Humans
Lipoproteins, HDL
Mental Health
Middle Aged
Models, Statistical
Multicenter Studies as Topic
Young Adult

Chemicals

Lipoproteins, HDL
Cholesterol
