Application of machine learning for multi-community COVID-19 outbreak predictions with wastewater surveillance.

Yuehan Ai, Fan He, Emma Lancaster, Jiyoung Lee
Author Information
  1. Yuehan Ai: Department of Food Science and Technology, The Ohio State University, Columbus, OH, United States of America.
  2. Fan He: Department of Food Science and Technology, The Ohio State University, Columbus, OH, United States of America.
  3. Emma Lancaster: Division of Environmental Health Sciences, College of Public Health, The Ohio State University, Columbus, OH, United States of America.
  4. Jiyoung Lee: Department of Food Science and Technology, The Ohio State University, Columbus, OH, United States of America.

Abstract

Wastewater-based epidemiology (WBE) has demonstrated its potential as a surveillance and early warning tool for COVID-19 outbreaks. In areas with limited testing capacity, wastewater surveillance can provide information on disease dynamics at the community level. A predictive model is key to generating quantitative estimates of the infected population. Modeling longitudinal wastewater data can be challenging because biomarkers in wastewater are susceptible to variations caused by multiple factors associated with the wastewater matrix and sewershed characteristics. Because WBE is still an emerging approach, a model should be able to address the uncertainties of wastewater from different sewersheds. We proposed exploiting machine learning and deep learning techniques, which are supported by the growing body of WBE data. In this article, we reviewed the existing predictive models, among which the emerging machine learning and deep learning models showed great potential. However, most models are built for individual sewersheds and use few features extracted from the wastewater. To fill this research gap, we compared time-series and non-time-series models for their short-term predictive performance on COVID-19 cases in 9 diverse sewersheds. The time-series models, long short-term memory (LSTM) and Prophet, outperformed the non-time-series models. Besides viral (SARS-CoV-2) loads and location identity, domain-specific features such as biochemical parameters of wastewater, geographical parameters of the sewersheds, and some socioeconomic parameters of the communities can contribute to the models. With proper feature engineering and hyperparameter tuning, we believe machine learning models such as LSTM can be a feasible solution for COVID-19 trend prediction via WBE. Overall, this is a proof-of-concept study on the application of machine learning in COVID-19 WBE. Future studies are needed to deploy and maintain the model in more real-world applications.
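
As an illustration of the feature engineering a multi-sewershed time-series model requires, the following minimal sketch shows one way to turn per-sewershed observations into fixed-length input windows with a one-hot location identity. It is not the authors' pipeline: the function name `make_windows`, the `lookback` parameter, the site labels, and all numbers are hypothetical, and the only features used are viral load and location.

```python
from typing import Dict, List, Tuple

Sample = List[List[float]]  # one window: lookback steps x features


def make_windows(
    series: Dict[str, List[Tuple[float, float]]],
    lookback: int,
) -> Tuple[List[Sample], List[float]]:
    """Build sliding-window samples from per-sewershed (viral_load, cases) series.

    Each time step in a window holds the viral load followed by a one-hot
    encoding of the sewershed. The target is the case count at the step
    immediately after the window. Windows never span two sewersheds.
    """
    sites = sorted(series)  # fixed ordering so the one-hot positions are stable
    X: List[Sample] = []
    y: List[float] = []
    for idx, site in enumerate(sites):
        one_hot = [1.0 if j == idx else 0.0 for j in range(len(sites))]
        obs = series[site]
        for start in range(len(obs) - lookback):
            window = [[obs[t][0]] + one_hot for t in range(start, start + lookback)]
            X.append(window)
            y.append(obs[start + lookback][1])
    return X, y


# Made-up values for illustration only (hypothetical sites and measurements).
toy = {
    "site_a": [(1.0, 10.0), (2.0, 12.0), (3.0, 15.0), (4.0, 20.0)],
    "site_b": [(5.0, 3.0), (6.0, 4.0), (7.0, 6.0)],
}
X, y = make_windows(toy, lookback=2)
print(len(X), y)
```

Samples shaped this way (samples, time steps, features) match the input layout that recurrent models such as LSTM commonly expect. Additional per-step features, for example biochemical or socioeconomic parameters, could be appended to each step in the same manner.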

MeSH Term

Humans
COVID-19
SARS-CoV-2
Wastewater
Wastewater-Based Epidemiological Monitoring
Disease Outbreaks
Machine Learning
RNA, Viral

Chemicals

Waste Water
RNA, Viral
