Detection of COVID-19 epidemic outbreak using machine learning.

Giphil Cho, Jeong Rye Park, Yongin Choi, Hyeonjeong Ahn, Hyojung Lee
Author Information
  1. Giphil Cho: Department of Artificial Intelligence and Software, Kangwon National University, Samcheok-si, Republic of Korea.
  2. Jeong Rye Park: Department of Mathematics, Kyungpook National University, Daegu, Republic of Korea.
  3. Yongin Choi: Busan Center for Medical Mathematics, National Institute for Mathematical Sciences, Daejeon, Republic of Korea.
  4. Hyeonjeong Ahn: Department of Statistics, Kyungpook National University, Daegu, Republic of Korea.
  5. Hyojung Lee: Department of Statistics, Kyungpook National University, Daegu, Republic of Korea.

Abstract

Background: The coronavirus disease (COVID-19) pandemic has spread rapidly across the world, creating an urgent need for predictive models that can help healthcare providers prepare for and respond to outbreaks more quickly and effectively, and ultimately improve patient care. Early detection and warning systems are crucial for preventing and controlling epidemic spread.
Objective: In this study, we propose a machine learning-based method to predict the transmission trend of COVID-19 and a new approach to detect the start time of new outbreaks by analyzing epidemiological data.
Methods: We developed a risk index to measure the change in the transmission trend. We applied machine learning (ML) techniques to predict COVID-19 transmission trends, categorized into three labels: decrease (L0), maintain (L1), and increase (L2). We used Support Vector Machine (SVM), Random Forest (RF), and XGBoost (XGB) as ML models, and employed grid search to determine the optimal hyperparameters for each of the three models. We proposed a new method that detects the start time of a new outbreak when label L2 is sustained for at least 14 days (the duration of maintenance). We compared the performance of the ML models to identify the most accurate approach for outbreak detection. We also conducted a sensitivity analysis, varying the duration of maintenance between 7 and 28 days.
Results: The ML methods demonstrated high accuracy (over 94%) in classifying transmission trends. Our proposed method successfully predicted the start time of new outbreaks, enabling us to detect a total of seven outbreaks, whereas five outbreaks were reported in Korea between March 2020 and October 2022. This indicates that our method can also detect minor outbreaks. Among the ML models, the RF and XGB classifiers exhibited the highest accuracy in outbreak detection.
Conclusion: The study highlights the strength of our method in accurately predicting the timing of an outbreak using an interpretable and explainable approach. It could provide a standard for predicting the start time of new outbreaks and detecting future transmission trends. This method can contribute to the development of targeted prevention and control measures and enhance resource management during the pandemic.
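
The detection rule described in the Methods (label L2 sustained for at least the duration of maintenance) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `detect_outbreak_starts`, the integer encoding of labels (0, 1, 2 for L0, L1, L2), and the choice to report the first day of a sustained run as the outbreak start are all assumptions made for illustration, and the label sequence is synthetic.

```python
from typing import List

INCREASE = 2  # assumed integer encoding of label L2 (increase)


def detect_outbreak_starts(labels: List[int], min_duration: int = 14) -> List[int]:
    """Return the first index of every run of INCREASE labels that lasts
    at least `min_duration` consecutive days.

    Assumption: the outbreak start is taken to be the first day of the run.
    """
    starts: List[int] = []
    run_start = None  # index where the current run of INCREASE began
    for day, label in enumerate(labels):
        if label == INCREASE:
            if run_start is None:
                run_start = day
        else:
            # The run just ended; keep it only if it was long enough.
            if run_start is not None and day - run_start >= min_duration:
                starts.append(run_start)
            run_start = None
    # Handle a run that is still ongoing at the end of the series.
    if run_start is not None and len(labels) - run_start >= min_duration:
        starts.append(run_start)
    return starts


# Synthetic daily labels: a 10-day increase (days 5-14) and a 20-day increase (days 18-37).
labels = [0] * 5 + [2] * 10 + [1] * 3 + [2] * 20 + [0] * 4

# Vary the duration of maintenance, in the spirit of the sensitivity analysis (7 to 28 days).
for duration in (7, 14, 21, 28):
    print(duration, detect_outbreak_starts(labels, duration))
# 7 [5, 18]
# 14 [18]
# 21 []
# 28 []
```

On this synthetic series, a shorter duration of maintenance flags the brief 10-day increase as an outbreak, while the 14-day threshold keeps only the longer one, which shows how the threshold trades sensitivity to minor outbreaks against false alarms.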

MeSH Terms

Humans
COVID-19
Disease Outbreaks
Pandemics
Health Personnel
Machine Learning
