The Use of Multiple Imputation for Data Subject to Limits of Detection.

Ofer Harel, Neil Perkins, Enrique F Schisterman
Author Information
  1. Ofer Harel: Department of Statistics, University of Connecticut, USA.
  2. Neil Perkins: Epidemiology Branch, Eunice Kennedy Shriver National Institute for Child and Human Development, Rockville, MD, USA.
  3. Enrique F Schisterman: Epidemiology Branch, Eunice Kennedy Shriver National Institute for Child and Human Development, Rockville, MD, USA.

Abstract

Missing data due to limits of detection and limits of quantification are a common obstacle in epidemiological and biomedical research. We are interested in methodologies that provide unbiased and efficient estimates of these missing data while using popular statistical software. We describe a multiple imputation (MI) procedure for cross-sectional and longitudinal data that examines the sources of variation in hormone levels throughout the menstrual cycle, conditional on specific biomarkers. We describe the rationale, procedure, advantages, and disadvantages of the MI procedure. We also compare it with commonly used missing data procedures (complete case analysis and single imputation). We illustrate our approach using the BioCycle data, where we are interested in the effects of Vitamin E and Beta-carotene on Progesterone levels. We also evaluate the longitudinal impact of changes in Vitamin E on Progesterone levels over time. Finally, we demonstrate the advantages of using MI over complete case analysis or naive single replacement in both cross-sectional and longitudinal analyses where measurements below the limit of quantification (LOQ) are unreported. We also illustrate that, when available, including data below the limit of detection (LOD) that might otherwise be deemed unreliable improves simple estimation substantially.
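The general idea of imputing values below an LOQ and pooling across imputations can be sketched as follows. This is a minimal illustration, not the procedure used in the paper: it assumes a single variable that is roughly normal (for example after a log transform), marks unreported values as `None`, reflects parameter uncertainty with a bootstrap, fits the censored normal with a simple moment-style estimator, and pools with Rubin's rules. All function names are hypothetical.

```python
import random
from statistics import NormalDist, mean, variance

STD = NormalDist()  # standard normal, used for quantiles and densities


def fit_censored_normal(observed, n_censored, loq):
    """Moment-style estimate of (mu, sigma) for a normal left-censored at loq.

    Assumption for this sketch: at least two observed values remain.
    """
    p = n_censored / (len(observed) + n_censored)  # share below the LOQ
    if p == 0:
        return mean(observed), variance(observed) ** 0.5
    alpha = STD.inv_cdf(p)             # estimate of (loq - mu) / sigma
    hazard = STD.pdf(alpha) / (1 - p)  # E[Z | Z > alpha] for standard normal Z
    sigma = (mean(observed) - loq) / (hazard - alpha)
    return loq - sigma * alpha, sigma


def draw_below(mu, sigma, loq, rng):
    """Draw one value from Normal(mu, sigma) truncated to lie below loq."""
    dist = NormalDist(mu, sigma)
    u = dist.cdf(loq) * (1.0 - rng.random())  # uniform on (0, cdf(loq)]
    return dist.inv_cdf(u)


def multiply_impute_mean(values, loq, m=20, seed=0):
    """Return (pooled mean, total variance) over m imputed data sets."""
    rng = random.Random(seed)
    n = len(values)
    estimates, within = [], []
    for _ in range(m):
        # Bootstrap the data so the fitted parameters vary between imputations.
        boot = [rng.choice(values) for _ in range(n)]
        obs = [v for v in boot if v is not None]
        mu, sigma = fit_censored_normal(obs, n - len(obs), loq)
        # Fill in every unreported value with a draw from below the LOQ.
        completed = [v if v is not None else draw_below(mu, sigma, loq, rng)
                     for v in values]
        estimates.append(mean(completed))
        within.append(variance(completed) / n)
    # Rubin's rules: total = within + (1 + 1/m) * between.
    q_bar = mean(estimates)
    total = mean(within) + (1 + 1 / m) * variance(estimates)
    return q_bar, total
```

Calling `multiply_impute_mean(values, loq)` on a list containing `None` for each unreported measurement returns a pooled estimate of the mean and its total variance. A regression or longitudinal analysis would replace the per-data-set mean with the fitted model coefficients and pool them in the same way.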

Grants

  1. K01 MH087219/NIMH NIH HHS
  2. Z01 HD008761-05/Intramural NIH HHS
