Deriving Insights From Open-Ended Learner Feedback: An Exploration of Natural Language Processing Approaches.
Marta M Maslej, Kayle Donner, Anupam Thakur, Faisal Islam, Kenya A Costa-Dookhan, Sanjeev Sockalingam
Author Information
Marta M Maslej: Staff Scientist, Krembil Centre for Neuroinformatics, Centre for Addiction and Mental Health, and Assistant Professor, Department of Psychiatry, University of Toronto, Toronto, Canada.
Kayle Donner: Research Methods Specialist, Office of Education, Centre for Addiction and Mental Health, Toronto, Canada.
Anupam Thakur: Psychiatrist, Adult Neurodevelopmental Services, Centre for Addiction and Mental Health, and Assistant Professor, Department of Psychiatry, University of Toronto, Toronto, Canada.
Faisal Islam: Manager, Evaluation & Quality Improvement, Office of Education, Centre for Addiction and Mental Health, Toronto, Canada.
Kenya A Costa-Dookhan: Resident Physician, Office of Education, Centre for Addiction and Mental Health, and Temerty Faculty of Medicine, University of Toronto, Toronto, Canada.
Sanjeev Sockalingam: Senior Vice President, Education, and Chief Medical Officer, Office of Education, Centre for Addiction and Mental Health, and Professor, Department of Psychiatry, University of Toronto, Toronto, Canada.
INTRODUCTION: Open-ended feedback from learners offers valuable insights for adapting continuing health education to their needs; however, analyzing this feedback with qualitative methods is burdensome. Natural language processing (NLP) offers a potential solution, but it is unclear which methods yield useful insights. We evaluated NLP methods for analyzing open-ended feedback from continuing professional development training at a psychiatric hospital.
METHODS: The data set consisted of survey responses from staff participants, including two text responses: how participants intended to use the training ("intent to use"; n = 480) and other information they wished to share ("open-ended feedback"; n = 291). We analyzed "intent-to-use" responses with topic modeling, "open-ended feedback" responses with sentiment analysis, and both response types with large language model (LLM)-based clustering. We examined the outputs of each approach to determine their value for deriving insights about the training.
RESULTS: Because the "intent-to-use" responses were short and lacked diversity, topic modeling produced topics whose content could not be meaningfully differentiated. For "open-ended feedback," sentiment scores did not accurately reflect the valence of responses. The LLM-based clustering approach generated meaningful clusters, characterized by semantically similar words, for both response types.
DISCUSSION: LLMs may be a useful approach for deriving insights from learner feedback because they capture context, allowing them to distinguish responses that use similar words to convey different topics. Future directions include exploring other LLM-based methods and examining how these methods fare on other data sets or types of learner feedback.
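The clustering pipeline referenced in the methods (per the cited MiniLM, UMAP, HDBSCAN, and BERTopic sources) has a common three-step shape: embed each response as a vector, reduce dimensionality, then cluster. The sketch below illustrates only that shape, substituting widely available scikit-learn components for the cited stack (TF-IDF in place of MiniLM sentence embeddings, truncated SVD in place of UMAP, k-means in place of HDBSCAN); the response texts and cluster count are invented for illustration.

```python
# Minimal embed-reduce-cluster sketch for short learner responses.
# TF-IDF, TruncatedSVD, and KMeans stand in for the MiniLM/UMAP/HDBSCAN
# pipeline described in the cited references; this is not the study's code.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.cluster import KMeans

responses = [
    "I will apply trauma-informed language with my clients",
    "Plan to use trauma-informed communication on my unit",
    "Great session, the facilitator was engaging",
    "The workshop was well organized and engaging",
]

# 1. Represent each response as a vector (stand-in for LLM embeddings).
vectors = TfidfVectorizer().fit_transform(responses)

# 2. Reduce dimensionality before clustering (stand-in for UMAP).
reduced = TruncatedSVD(n_components=2, random_state=0).fit_transform(vectors)

# 3. Cluster the reduced vectors (stand-in for HDBSCAN).
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(reduced)

for text, label in zip(responses, labels):
    print(label, text)
```

In the actual pipeline, each cluster would then be characterized by its most representative words (as BERTopic does with class-based TF-IDF), which is how semantically similar responses surface as interpretable themes.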
References
Balmer DF, Anderson H, West DC. Program evaluation in health professions education: an innovative approach guided by principles. Acad Med. 2023;98:204–208.
Chary M, Parikh S, Manini AF, et al. A review of natural language processing in medical education. West J Emerg Med. 2019;20:78–86.
Blei DM, Ng AY, Jordan MI. Latent Dirichlet allocation. J Mach Learn Res. 2003;3:993–1022.
Costa-Dookhan KA, Maslej MM, Donner K, et al. Twelve tips for Natural Language Processing in medical education program evaluation. Med Teach. 2024;46:1147–1151.
Featherstone C, Botha A. Sentiment analysis of the ICT4Rural Education teacher professional development course. In: 2015 IST-Africa Conference; May 6-8, 2015. IEEE; 2015:1–12.
Misuraca M, Scepi G, Spano M. Using Opinion Mining as an educational analytic: an integrated strategy for the analysis of students' feedback. Stud Educ Eval. 2021;68:100979.
Lu KJQ, Meaney C, Guo E, et al. Evaluating the applicability of existing lexicon-based sentiment analysis techniques on family medicine resident feedback field notes: retrospective cohort study. JMIR Med Educ. 2023;9:e41953.
Heath JK, Clancy CB, Pluta W, et al. Natural Language processing of learners' evaluations of attendings to identify professionalism lapses. Eval Health Prof. 2023;46:225–232.
Yilmaz Y, Jurado Nunez A, Ariaeinejad A, et al. Harnessing natural language processing to support decisions around workplace-based assessment: machine learning study of competency-based medical education. JMIR Med Educ. 2022;8:e30537.
Maimone C, Dolan BM, Green MM, et al. Using natural language processing to visualize narrative feedback in a medical student performance dashboard. Acad Med. 2024;99:1094–1098.
Devlin J, Chang MW, Lee K, et al. BERT: pre-training of deep bidirectional transformers for language understanding. arXiv [cs.CL]. 2018. [Preprint]. Available at: http://arxiv.org/abs/1810.04805.
Radford A, Wu J, Child R, et al. Language models are unsupervised multitask learners. OpenAI Blog. 2019;1(8):9. Available at: https://insightcivic.s3.us-east-1.amazonaws.com/language-models.pdf. Accessed January 17, 2024.
Wu X, Duan R, Ni J. Unveiling security, privacy, and ethical concerns of ChatGPT. J Inf Intelligence. 2024;2:102–115.
Kasneci E, Sessler K, Küchemann S, et al. ChatGPT for good? On opportunities and challenges of large language models for education. Learn Individual Differ. 2023;103:102274.
Shepherd J. Unlocking the future of nursing education and continuing professional development by embracing generative artificial intelligence and advanced language models. IJPS. 2023;10:4.
Maimone C, Dolan BM, Green MM, et al. Utilizing natural language processing of narrative feedback to develop a predictive model of pre-clerkship performance: lessons learned. Perspect Med Educ. 2023;12:141–148.
Grootendorst M. BERTopic: neural topic modeling with a class-based TF-IDF procedure. arXiv [csCL]. 2022. [Preprint]. Available at: http://arxiv.org/abs/2203.05794.
Rinker T. sentimentr [computer software]. 2.9.0; 2017. Available at: https://github.com/trinker/sentimentr. Accessed November 28, 2024.
R: A Language and Environment for Statistical Computing [computer software]. 4.1.1. R Core Team; 2023. Available at: https://www.r-project.org/. Accessed November 28, 2024.
BERT (Language Model). Wikipedia, 2024. Available at: https://en.wikipedia.org/wiki/BERT_(language_model). Accessed July 23, 2024.
Wang W, Wei F, Dong L, et al. MiniLM: deep self-attention distillation for task-agnostic compression of pre-trained transformers. Adv Neural Inf Process Syst. 2020;33. Available at: https://proceedings.neurips.cc/paper/2020/hash/3f5ee243547dee91fbd053c1c4a845aa-Abstract.html. Accessed November 28, 2024.
McInnes L, Healy J, Melville J. UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction. arXiv. 2018. [Preprint]. Available at: https://arxiv.org/pdf/1802.03426.
Campello RJGB, Moulavi D, Sander J. Density-based clustering based on hierarchical density estimates. In: Pei J, Tseng VS, Cao L, et al, eds. Advances in Knowledge Discovery and Data Mining: PAKDD 2013. Lecture Notes in Computer Science, Vol 7819. Berlin, Heidelberg: Springer; 2013:160–172.
Grün B, Hornik K. topicmodels: An R Package for Fitting Topic Models [computer software]. 0.2-16; 2011. Available at: https://cran.r-project.org/web/packages/topicmodels/index.html. Accessed November 28, 2024.
Python Language Reference [computer software]. 3.8. Python Software Foundation. Available at: http://www.python.org. Accessed November 28, 2024.
BERTopic [computer software]. 0.16.3. Available at: https://github.com/MaartenGr/BERTopic. Accessed November 28, 2024.
Deriving insights from open-ended learner feedback. Open Sci Framework. 2024. Available at: https://osf.io/u67e5/?view_only=afd98f4ecfcc46b0bec4ae97133732a6. Accessed November 28, 2024.