Imputation methods for missing data for polygenic models.

Brooke Fridley, Kari Rabe, Mariza de Andrade
Author Information
  1. Brooke Fridley: Department of Statistics, Iowa State University, Ames, Iowa, USA. Fridley.broo@uwlax.edu

Abstract

Methods for handling missing data have been an area of statistical research for many years, but little has been done within the context of pedigree analysis. In this paper we present two methods for imputing missing data for polygenic models using family data. Both imputation schemes take familial relationships into account and use the observed familial information for the imputation. The first is a traditional multiple imputation approach; the second is multiple imputation, or data augmentation, within a Gibbs sampler. We used both the Genetic Analysis Workshop 13 simulated missing-phenotype and complete-phenotype data sets to illustrate the two methods. We looked at the phenotypic trait systolic blood pressure and the covariate gender at time point 11 (1970) for Cohort 1 and time point 1 (1971) for Cohort 2. Comparing the results for three replicates of the complete data and of the missing data with multiple imputation, we find that multiple imputation via a Gibbs sampler produces more accurate results. Thus, we recommend the Gibbs sampler for imputation purposes because of the ease with which it can be extended to more complicated models, the consistency of the results, and its accounting for the variation due to imputation.
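
The abstract does not spell out how estimates from the imputed data sets are combined. As a minimal sketch, assuming the standard pooling rules usually attributed to Rubin (an assumption here, not something the abstract confirms), the variation due to imputation enters the total variance as a between-imputation term. The function name `pool_estimates` and all numbers below are hypothetical and are not results from the paper.

```python
from statistics import mean, variance


def pool_estimates(estimates, variances):
    """Pool per-imputation results with Rubin's rules (assumed, see lead-in).

    estimates: point estimate of one parameter from each imputed data set
    variances: squared standard error of that estimate in each data set
    Returns (pooled estimate, total variance).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations and matching lengths")

    pooled = mean(estimates)          # average of the m point estimates
    within = mean(variances)          # average within-imputation variance
    between = variance(estimates)     # sample variance (m - 1 denominator)

    # The (1 + 1/m) factor inflates the between-imputation variance to
    # account for using a finite number of imputations.
    total = within + (1 + 1 / m) * between
    return pooled, total


# Hypothetical estimates from three imputed data sets (illustrative only).
pooled, total = pool_estimates([0.40, 0.50, 0.60], [0.01, 0.01, 0.01])
print(pooled, total)
```

With a Gibbs sampler that redraws the missing phenotypes at every iteration, this pooling step is typically unnecessary, because the posterior draws of the parameters already reflect the imputation uncertainty.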

Grants

  1. R01 HL071917/NHLBI NIH HHS
  2. R01HL71917/NHLBI NIH HHS

MeSH Terms

Adult Children
Bayes Theorem
Blood Pressure
Cohort Studies
Computer Simulation
Factor Analysis, Statistical
Female
Humans
Male
Models, Genetic
Models, Statistical
Multifactorial Inheritance
Phenotype
Research Design
Sampling Studies
Systole
