Early Warning Scores With and Without Artificial Intelligence.

Dana P Edelson, Matthew M Churpek, Kyle A Carey, Zhenqui Lin, Chenxi Huang, Jonathan M Siner, Jennifer Johnson, Harlan M Krumholz, Deborah J Rhodes
Author Information
  1. Dana P Edelson: Section of Hospital Medicine, University of Chicago, Chicago, Illinois.
  2. Matthew M Churpek: Section of Pulmonary and Critical Care Medicine, University of Wisconsin School of Medicine and Public Health, Madison.
  3. Kyle A Carey: Section of Hospital Medicine, University of Chicago, Chicago, Illinois.
  4. Zhenqui Lin: Section of Cardiovascular Medicine, Yale School of Medicine, Yale University, New Haven, Connecticut.
  5. Chenxi Huang: Section of Cardiovascular Medicine, Yale School of Medicine, Yale University, New Haven, Connecticut.
  6. Jonathan M Siner: Section of Pulmonary, Critical Care and Sleep Medicine, Yale School of Medicine, New Haven, Connecticut.
  7. Jennifer Johnson: Care Signature, Yale New Haven Health, New Haven, Connecticut.
  8. Harlan M Krumholz: Section of Cardiovascular Medicine, Yale School of Medicine, Yale University, New Haven, Connecticut.
  9. Deborah J Rhodes: Section of General Internal Medicine, Yale School of Medicine, New Haven, Connecticut.

Abstract

Importance: Early warning decision support tools to identify clinical deterioration in the hospital are widely used, but there is little information on their comparative performance.
Objective: To compare 3 proprietary artificial intelligence (AI) early warning scores and 3 publicly available simple aggregated weighted scores.
Design, Setting, and Participants: This retrospective cohort study was performed at 7 hospitals in the Yale New Haven Health System. All consecutive adult medical-surgical ward hospital encounters between March 9, 2019, and November 9, 2023, were included.
Exposures: Simultaneous Epic Deterioration Index (EDI), Rothman Index (RI), eCARTv5 (eCART), Modified Early Warning Score (MEWS), National Early Warning Score (NEWS), and NEWS2 scores.
Main Outcomes and Measures: Clinical deterioration, defined as a transfer from ward to intensive care unit or death within 24 hours of an observation.
Results: Of the 362 926 patient encounters (median patient age, 64 [IQR, 47-77] years; 200 642 [55.3%] female), 16 693 (4.6%) experienced a clinical deterioration event. eCART had the highest area under the receiver operating characteristic curve at 0.895 (95% CI, 0.891-0.900), followed by NEWS2 at 0.831 (95% CI, 0.826-0.836), NEWS at 0.829 (95% CI, 0.824-0.835), RI at 0.828 (95% CI, 0.823-0.834), EDI at 0.808 (95% CI, 0.802-0.812), and MEWS at 0.757 (95% CI, 0.750-0.764). After matching scores at the moderate-risk sensitivity level for a NEWS score of 5, overall positive predictive values (PPVs) ranged from a low of 6.3% (95% CI, 6.1%-6.4%) for an EDI score of 41 to a high of 17.3% (95% CI, 16.9%-17.8%) for an eCART score of 94. Matching scores at the high-risk specificity of a NEWS score of 7 yielded overall PPVs ranging from a low of 14.5% (95% CI, 14.0%-15.2%) for an EDI score of 54 to a high of 23.3% (95% CI, 22.7%-24.2%) for an eCART score of 97. The moderate-risk thresholds provided a median of at least 20 hours of lead time for all the scores. Median lead time at the high-risk threshold was 11 (IQR, 0-69) hours for eCART, 8 (IQR, 0-63) hours for NEWS, 6 (IQR, 0-62) hours for NEWS2, 5 (IQR, 0-56) hours for MEWS, 1 (IQR, 0-39) hour for EDI, and 0 (IQR, 0-42) hours for RI.
Conclusions and Relevance: In this cohort study of inpatient encounters, eCART outperformed the other AI and non-AI scores, identifying more deteriorating patients with fewer false alarms and sufficient time to intervene. NEWS, a non-AI, publicly available early warning score, significantly outperformed EDI. Given the wide variation in accuracy, additional transparency and oversight of early warning tools may be warranted.
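
The threshold matching described in the Results can be illustrated with a minimal sketch. This is not the authors' code: the function names and toy data below are hypothetical, alerts are assumed to fire when a score is at or above its threshold (a score where lower values mean higher risk would need to be negated first), and each list element stands for one observation with a binary deterioration label. The sketch fixes the sensitivity of a reference score at a chosen cutoff, finds the cutoff on a second score that reaches at least the same sensitivity, and then compares PPVs.

```python
def confusion_counts(scores, outcomes, threshold):
    """Count (tp, fp, fn, tn), treating score >= threshold as an alert."""
    tp = fp = fn = tn = 0
    for score, outcome in zip(scores, outcomes):
        alert = score >= threshold
        if alert and outcome:
            tp += 1
        elif alert:
            fp += 1
        elif outcome:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def sensitivity(scores, outcomes, threshold):
    """Fraction of deterioration events flagged at this threshold."""
    tp, _, fn, _ = confusion_counts(scores, outcomes, threshold)
    return tp / (tp + fn) if (tp + fn) else 0.0


def ppv(scores, outcomes, threshold):
    """Fraction of alerts that are followed by a deterioration event."""
    tp, fp, _, _ = confusion_counts(scores, outcomes, threshold)
    return tp / (tp + fp) if (tp + fp) else 0.0


def threshold_matching_sensitivity(scores, outcomes, target):
    """Highest observed threshold whose sensitivity is at least `target`."""
    for candidate in sorted(set(scores), reverse=True):
        if sensitivity(scores, outcomes, candidate) >= target:
            return candidate
    return None


# Hypothetical toy data: 1 = deterioration within the outcome window.
outcomes = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
reference_scores = [7, 6, 5, 3, 5, 4, 2, 1, 1, 0, 2, 3]
other_scores = [90, 80, 40, 70, 50, 30, 20, 10, 15, 5, 25, 35]

# Fix sensitivity at the reference score's cutoff of 5, then match it.
target = sensitivity(reference_scores, outcomes, 5)
matched = threshold_matching_sensitivity(other_scores, outcomes, target)

reference_ppv = ppv(reference_scores, outcomes, 5)
other_ppv = ppv(other_scores, outcomes, matched)
print(target, matched, reference_ppv, other_ppv)
```

Matching on specificity instead, as in the high-risk comparison, would follow the same pattern with tn / (tn + fp) in place of sensitivity.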

MeSH Terms

Humans
Female
Middle Aged
Male
Early Warning Score
Artificial Intelligence
Retrospective Studies
Aged
Clinical Deterioration
Intensive Care Units
ROC Curve
Hospital Mortality
