Relation Extraction from Clinical Narratives Using Pre-trained Language Models.

Qiang Wei, Zongcheng Ji, Yuqi Si, Jingcheng Du, Jingqi Wang, Firat Tiryaki, Stephen Wu, Cui Tao, Kirk Roberts, Hua Xu
Author Information
  1. Qiang Wei: School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, USA.
  2. Zongcheng Ji: School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, USA.
  3. Yuqi Si: School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, USA.
  4. Jingcheng Du: School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, USA.
  5. Jingqi Wang: School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, USA.
  6. Firat Tiryaki: School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, USA.
  7. Stephen Wu: School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, USA.
  8. Cui Tao: School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, USA.
  9. Kirk Roberts: School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, USA.
  10. Hua Xu: School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, USA.

Abstract

Natural language processing (NLP) is useful for extracting information from clinical narratives, and both traditional machine learning methods and more recent deep learning methods have been successful in various clinical NLP tasks. These methods often depend on traditional word embeddings, which are outputs of language models (LMs). Recently, methods that use pre-trained language models directly and then fine-tune them, e.g., Bidirectional Encoder Representations from Transformers (BERT), have achieved state-of-the-art performance on many NLP tasks. Despite their success in the open domain and in biomedical literature, these pre-trained LMs have not yet been applied to the clinical relation extraction (RE) task. In this study, we developed two different implementations of the BERT model for clinical RE tasks. Our results show that our tuned LMs outperformed previous state-of-the-art RE systems in two shared tasks, which demonstrates the potential of LM-based methods for the RE task.
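
The abstract does not describe how the two BERT implementations present a candidate relation to the model. As an illustration only, one widely used way to frame RE as sequence classification is to wrap the two candidate entities in marker tokens before tokenization. The sketch below assumes character-offset spans and hypothetical marker strings (`[E1]`, `[E2]`); the function name `mark_entities` is also hypothetical, and none of this is taken from the paper.

```python
from typing import Tuple

Span = Tuple[int, int]  # (start, end) character offsets, end exclusive


def mark_entities(text: str, head: Span, tail: Span) -> str:
    """Wrap two non-overlapping entity spans in marker tokens.

    The head entity is tagged E1 and the tail entity E2, regardless of
    which appears first in the text. Marker strings are assumptions.
    """
    # Order the spans by position so the text can be sliced left to right.
    (s1, e1), tag1 = sorted([(head, "E1"), (tail, "E2")])[0]
    (s2, e2), tag2 = sorted([(head, "E1"), (tail, "E2")])[1]
    if e1 > s2:
        raise ValueError("entity spans must not overlap")
    return (
        text[:s1]
        + f"[{tag1}] " + text[s1:e1] + f" [/{tag1}]"
        + text[e1:s2]
        + f"[{tag2}] " + text[s2:e2] + f" [/{tag2}]"
        + text[e2:]
    )


sentence = "Aspirin was given for headache."
marked = mark_entities(sentence, head=(0, 7), tail=(22, 30))
print(marked)
```

The marked sentence could then be passed to any sentence-level classifier that predicts a relation label for the pair; whether the authors used this input format is not stated in the abstract.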

Grants

  1. R00 LM012104/NLM NIH HHS
  2. R01 LM010681/NLM NIH HHS
  3. U01 TR002062/NCATS NIH HHS
  4. U24 CA194215/NCI NIH HHS

MeSH Terms

Datasets as Topic
Humans
Information Storage and Retrieval
Machine Learning
Narration
Natural Language Processing
Semantics
