Statistical outbreak detection by joining medical records and pathogen similarity.
James K Miller, Jieshi Chen, Alexander Sundermann, Jane W Marsh, Melissa I Saul, Kathleen A Shutt, Marissa Pacey, Mustapha M Mustapha, Lee H Harrison, Artur Dubrawski
Author Information
James K Miller: Auton Lab, Carnegie Mellon University, Pittsburgh, PA, United States. Electronic address: mille856@andrew.cmu.edu.
Jieshi Chen: Auton Lab, Carnegie Mellon University, Pittsburgh, PA, United States.
Alexander Sundermann: Infectious Diseases Epidemiology Research Unit, University of Pittsburgh School of Medicine and Graduate School of Public Health, Pittsburgh, PA, United States; Department of Infection Control and Hospital Epidemiology, University of Pittsburgh Medical Center, Pittsburgh, PA, United States.
Jane W Marsh: Infectious Diseases Epidemiology Research Unit, University of Pittsburgh School of Medicine and Graduate School of Public Health, Pittsburgh, PA, United States.
Melissa I Saul: Department of Medicine, University of Pittsburgh School of Medicine, Pittsburgh, PA, United States.
Kathleen A Shutt: Infectious Diseases Epidemiology Research Unit, University of Pittsburgh School of Medicine and Graduate School of Public Health, Pittsburgh, PA, United States.
Marissa Pacey: Infectious Diseases Epidemiology Research Unit, University of Pittsburgh School of Medicine and Graduate School of Public Health, Pittsburgh, PA, United States.
Mustapha M Mustapha: Infectious Diseases Epidemiology Research Unit, University of Pittsburgh School of Medicine and Graduate School of Public Health, Pittsburgh, PA, United States.
Lee H Harrison: Infectious Diseases Epidemiology Research Unit, University of Pittsburgh School of Medicine and Graduate School of Public Health, Pittsburgh, PA, United States.
Artur Dubrawski: Auton Lab, Carnegie Mellon University, Pittsburgh, PA, United States.
We present a statistical inference model for the detection and characterization of outbreaks of hospital-associated infection. The approach combines patient exposures, determined from electronic medical records, with pathogen similarity, determined by whole-genome sequencing, to simultaneously identify probable outbreaks and their root causes. We show how the model can be used to target isolates for whole-genome sequencing, improving outbreak detection and characterization even without comprehensive sequencing. We also show how to learn model parameters from reference data on known outbreaks. We evaluate model performance using semi-synthetic experiments.