Context-Aware Time Series Imputation for Multi-Analyte Clinical Data.

Kejing Yin, Liaoliao Feng, William K Cheung
Author Information
  1. Kejing Yin: Department of Computer Science, Hong Kong Baptist University, Hong Kong, SAR China.
  2. Liaoliao Feng: School of Computer Science & Technology, East China Normal University, Shanghai, China.
  3. William K Cheung: Department of Computer Science, Hong Kong Baptist University, Hong Kong, SAR China.

Abstract

Clinical time series imputation is recognized as an essential task in clinical data analytics. Most models rely either on strong assumptions regarding the underlying data-generation process or on preservation of only local properties without effective consideration of global dependencies. To advance the state of the art in clinical time series imputation, we participated in the 2019 ICHI Data Analytics Challenge on Missing Data Imputation (DACMI). In this paper, we present our proposed model: Context-Aware Time Series Imputation (CATSI), a novel framework based on a bidirectional LSTM in which patients' health states are explicitly captured by learning a "global context vector" from the entire clinical time series. The imputations are then produced with reference to the global context vector. We also incorporate a cross-feature imputation component to explore the complex feature correlations. Empirical evaluations demonstrate that CATSI obtains a normalized root mean square deviation (nRMSD) of 0.1998, which is 10.6% better than that of state-of-the-art models. Further experiments on consecutive missing datasets also illustrate the effectiveness of incorporating the global context in the generation of accurate imputations.
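
The abstract reports results as a normalized root mean square deviation (nRMSD) but does not spell out the formula. The following is a minimal sketch of one common way such a metric can be computed for a single analyte, assuming that each error is normalized by the range (maximum minus minimum) of that patient's true values and that only masked (imputed) positions are scored. The function name `nrmsd`, its arguments, and the choice to skip constant series are illustrative assumptions, not details taken from the paper or the challenge.

```python
import math


def nrmsd(truth, imputed, masked):
    """Range-normalized RMSD over masked entries of one analyte.

    truth, imputed: one list of floats per patient (same shapes).
    masked: one list of booleans per patient; True marks a value that
            was hidden from the model and then imputed.
    Assumption: errors are scaled by each patient's own value range.
    """
    squared_sum = 0.0
    count = 0
    for y, x, m in zip(truth, imputed, masked):
        span = max(y) - min(y)
        if span == 0:
            # Assumption: a constant series has no usable range, so skip it.
            continue
        for y_t, x_t, m_t in zip(y, x, m):
            if m_t:
                squared_sum += ((x_t - y_t) / span) ** 2
                count += 1
    if count == 0:
        raise ValueError("no masked entries to evaluate")
    return math.sqrt(squared_sum / count)


# Two hypothetical patients, one masked value each.
truth = [[1.0, 2.0, 3.0], [10.0, 20.0, 30.0]]
imputed = [[1.0, 2.5, 3.0], [10.0, 20.0, 25.0]]
masked = [[False, True, False], [False, False, True]]
score = nrmsd(truth, imputed, masked)
```

Under this definition a lower score is better, and normalizing by each patient's range keeps analytes measured on very different scales comparable.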
