Can Patients with Dementia Be Identified in Primary Care Electronic Medical Records Using Natural Language Processing?

Laura C Maclagan, Mohamed Abdalla, Daniel A Harris, Therese A Stukel, Branson Chen, Elisa Candido, Richard H Swartz, Andrea Iaboni, R Liisa Jaakkimainen, Susan E Bronskill
Author Information
  1. Laura C Maclagan: ICES, G1-06, 2075 Bayview Avenue, Toronto, M4N 3M5 Canada.
  2. Mohamed Abdalla: Department of Computer Science, University of Toronto, Toronto, Canada.
  3. Daniel A Harris: Division of Epidemiology, Dalla Lana School of Public Health, University of Toronto, Toronto, Canada.
  4. Therese A Stukel: ICES, G1-06, 2075 Bayview Avenue, Toronto, M4N 3M5 Canada.
  5. Branson Chen: ICES, G1-06, 2075 Bayview Avenue, Toronto, M4N 3M5 Canada.
  6. Elisa Candido: ICES, G1-06, 2075 Bayview Avenue, Toronto, M4N 3M5 Canada.
  7. Richard H Swartz: ICES, G1-06, 2075 Bayview Avenue, Toronto, M4N 3M5 Canada.
  8. Andrea Iaboni: KITE Research Institute, Toronto Rehabilitation Institute, University Health Network, Toronto, Canada.
  9. R Liisa Jaakkimainen: ICES, G1-06, 2075 Bayview Avenue, Toronto, M4N 3M5 Canada.
  10. Susan E Bronskill: ICES, G1-06, 2075 Bayview Avenue, Toronto, M4N 3M5 Canada.

Abstract

Dementia and mild cognitive impairment can be underrecognized in primary care practice and research. Free-text fields in electronic medical records (EMRs) are a rich source of information that might support increased detection and enable a better understanding of populations at risk of dementia. We used natural language processing (NLP) to identify dementia-related features in EMRs and compared the performance of supervised machine learning models in classifying patients with dementia. We assembled a cohort of primary care patients aged 66+ years in Ontario, Canada, from EMR notes collected until December 2016: 526 with dementia and 44,148 without dementia. We identified dementia-related features by applying published lists, clinician input, and NLP with word embeddings to free-text progress and consult notes, and organized the features into thematic groups. Using machine learning models, we compared how well these features detected dementia, both overall and during time periods defined relative to dementia case ascertainment in health administrative databases. Over 900 dementia-related features were identified and grouped into eight themes (including symptoms, social, function, cognition). Using notes from all time periods, LASSO had the best performance (F1 score: 77.2%, sensitivity: 71.5%, specificity: 99.8%). Model performance was poor when only notes written before case ascertainment were included (F1 score: 14.4%, sensitivity: 8.3%, specificity: 99.9%) but improved as later notes were added. While similar models may eventually improve recognition of cognitive issues and dementia in primary care EMRs, our findings suggest that further research is needed to identify which additional EMR components might be useful to promote early detection of dementia.
Supplementary Information: The online version contains supplementary material available at 10.1007/s41666-023-00125-6.
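
The abstract reports F1 score, sensitivity, and specificity for a cohort in which patients with dementia are a small minority (526 of 44,674). As a minimal sketch of how these three metrics relate, the Python function below computes them from confusion-matrix counts. The function name and the counts are hypothetical and chosen only for illustration; they are not taken from the paper, which does not report a confusion matrix in this abstract.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute sensitivity, specificity, precision and F1 from confusion-matrix counts.

    tp/fn: patients with dementia classified correctly/incorrectly.
    tn/fp: patients without dementia classified correctly/incorrectly.
    """
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # F1 is the harmonic mean of precision and sensitivity,
    # which simplifies to 2*TP / (2*TP + FP + FN).
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }


# Hypothetical counts (NOT from the paper) for a cohort of the same size:
# 526 patients with dementia and 44,148 without.
example = classification_metrics(tp=44, fp=44, tn=44104, fn=482)

for name, value in example.items():
    print(f"{name}: {value:.1%}")
```

With these illustrative counts, specificity is about 99.9% while sensitivity is about 8.4% and F1 about 14.3%. Because F1 ignores true negatives, a very high specificity on a large majority class does not offset a low sensitivity.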
