A dynamic approach to support outbreak management using reinforcement learning and semi-connected SEIQR models.

Yamin Kao, Po-Jui Chu, Pai-Chien Chou, Chien-Chang Chen
Author Information
  1. Yamin Kao: Geometric Data Vision Laboratory, Department of Biomedical Sciences and Engineering, National Central University, Taoyuan City, Taiwan.
  2. Po-Jui Chu: Geometric Data Vision Laboratory, Department of Biomedical Sciences and Engineering, National Central University, Taoyuan City, Taiwan.
  3. Pai-Chien Chou: Division of Pulmonary Medicine, Department of Internal Medicine, Taipei Medical University Hospital, Taipei, Taiwan.
  4. Chien-Chang Chen: Geometric Data Vision Laboratory, Department of Biomedical Sciences and Engineering, National Central University, Taoyuan City, Taiwan. gettgod@ncu.edu.tw.

Abstract

BACKGROUND: Containment measures slowed the spread of COVID-19 but led to a global economic crisis. We establish a reinforcement learning (RL) algorithm that balances disease control against economic activity.
METHODS: To train the RL agent, we design an RL environment with 4 semi-connected regions that represent the COVID-19 epidemic in Tokyo, Osaka, Okinawa, and Hokkaido, Japan. Each region is governed by a Susceptible-Exposed-Infected-Quarantined-Removed (SEIQR) model and has a transport hub that connects it with the other regions. The allocation of the synthetic population and the inter-regional travel are determined by population-weighted density. The agent learns the best policy by interacting with the RL environment: it obtains daily observations, takes actions on individual movement and screening, and receives feedback from the reward function. After training, we deploy the agent in RL environments describing the actual epidemic waves of the four regions to observe its performance.
RESULTS: For all epidemic waves covered by our study, the trained agent reduces the peak number of infectious cases and shortens the epidemic (for the 5th wave, the peak falls from 165 to 35 cases and the duration from 148 to 131 days). The agent is generally strict on screening but lenient on movement, except in Okinawa, where it is lenient on both. Action timing analyses indicate that movement restrictions are tightened when the number of exposed or infectious cases remains high or when infectious cases increase rapidly, and that screening stringency is eased when the number of exposed or infectious cases drops quickly or reaches a regional low. In Okinawa, screening is tightened when the number of exposed or infectious cases increases rapidly.
CONCLUSIONS: Our experiments demonstrate the potential of RL to assist policy-making and show how semi-connected SEIQR models provide an interactive environment for imitating cross-regional human flows.
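
To make the idea of semi-connected SEIQR regions concrete, the following is a minimal sketch, not the authors' implementation. It assumes a generic forward-Euler SEIQR update and a very simple travel rule in which a fixed fraction of each non-quarantined compartment leaves a region each day and is split evenly among the other regions. The abstract instead ties travel to population-weighted density, which is not reproduced here. All names (`Region`, `seiqr_step`, `exchange_travelers`, `simulate_day`) and all rate values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Region:
    """Compartment sizes for one region (hypothetical container)."""
    S: float
    E: float
    I: float
    Q: float
    R: float

    @property
    def N(self) -> float:
        return self.S + self.E + self.I + self.Q + self.R


def seiqr_step(r: Region, beta: float, sigma: float, kappa: float,
               gamma: float, dt: float = 1.0) -> None:
    """One forward-Euler step of a generic SEIQR model.

    beta: transmission rate, sigma: E -> I rate, kappa: I -> Q
    (screening) rate, gamma: removal rate from I and Q.
    All parameters are illustrative assumptions.
    """
    n = r.N
    new_exposed = beta * r.S * r.I / n * dt if n > 0 else 0.0
    new_infectious = sigma * r.E * dt
    new_quarantined = kappa * r.I * dt
    removed_from_i = gamma * r.I * dt
    removed_from_q = gamma * r.Q * dt

    r.S -= new_exposed
    r.E += new_exposed - new_infectious
    r.I += new_infectious - new_quarantined - removed_from_i
    r.Q += new_quarantined - removed_from_q
    r.R += removed_from_i + removed_from_q


def exchange_travelers(regions: list[Region], travel_rate: float) -> None:
    """Move a fraction of S, E, I, R out of each region via its hub.

    Quarantined individuals do not travel. Outflow is split evenly
    among the other regions (an assumption made for simplicity).
    """
    n = len(regions)
    if n < 2:
        return
    outflows = [{c: travel_rate * getattr(r, c) for c in "SEIR"}
                for r in regions]
    for i, r in enumerate(regions):
        for c in "SEIR":
            inflow = sum(outflows[j][c] for j in range(n) if j != i) / (n - 1)
            setattr(r, c, getattr(r, c) - outflows[i][c] + inflow)


def simulate_day(regions: list[Region], beta: float = 0.3,
                 sigma: float = 0.2, kappa: float = 0.2,
                 gamma: float = 0.1, travel_rate: float = 0.01) -> None:
    """Advance every region by one day, then exchange travelers."""
    for r in regions:
        seiqr_step(r, beta, sigma, kappa, gamma)
    exchange_travelers(regions, travel_rate)
```

In an RL setting of the kind the abstract describes, an agent's movement action could plausibly scale `travel_rate` (or `beta`) and its screening action could scale `kappa`. That mapping is an assumption of this sketch rather than something stated in the abstract.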

Grants

  1. 111-2221-E-008-087/The National Science and Technology Council, Taiwan

MeSH Terms

Humans
Reinforcement, Psychology
Learning
Reward
COVID-19
Communicable Diseases
Epidemics
