CC BY-NC-ND 4.0 · Methods Inf Med 2022; 61(S 01): e28-e34
DOI: 10.1055/s-0042-1742388
Original Article

Disambiguating Clinical Abbreviations Using a One-Fits-All Classifier Based on Deep Learning Techniques

Areej Jaber 1, 2 and Paloma Martínez 2

1   Applied Computing Department, Palestine Technical University - Kadoorie, Tulkarem, Palestine
2   Department of Computer Science, Universidad Carlos III de Madrid, Leganés, Spain
Funding This work has been supported by the Madrid Government (Comunidad de Madrid-Spain) under the Multiannual Agreement with UC3M in the line of Excellence of University Professors (EPUC3M17), and in the context of the V PRICIT (Regional Programme of Research and Technological Innovation) and Palestine Technical University - Kadoorie (Palestine). The work was also supported by the PID2020-116527RB-I00 project.

Abstract

Background Abbreviations are an essential part of the clinical narrative; they are used not only to save time and space but also to conceal serious or incurable illnesses. Misinterpreting clinical abbreviations can affect patients directly as well as downstream services such as clinical decision support systems. There is no consensus in the scientific community on how new abbreviations should be created, which makes them difficult to understand. Disambiguating clinical abbreviations aims to predict the exact meaning of an abbreviation from its context, a crucial step in understanding clinical notes.

Objectives Disambiguating clinical abbreviations is an essential task in information extraction from medical texts. Deep contextualized representation models have shown promising results in most word sense disambiguation tasks. In this work, we propose a one-fits-all classifier that disambiguates clinical abbreviations using deep contextualized representations from pretrained language models such as Bidirectional Encoder Representations from Transformers (BERT).

Methods A set of experiments with different pretrained clinical BERT models was performed to investigate fine-tuning methods for the disambiguation of clinical abbreviations. A one-fits-all classifier was used to improve the disambiguation of rare clinical abbreviations.
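To make the fine-tuning setup concrete, the following is a minimal sketch, in Python with the Hugging Face transformers library, of a one-fits-all disambiguator: a single classification head on top of a fine-tuned clinical BERT encoder, with a label space spanning the union of all abbreviation senses. The model identifier, label-space size, and hyperparameters here are illustrative assumptions, not the authors' exact configuration.

```python
# Minimal sketch of a one-fits-all abbreviation-sense classifier.
# One model, one head: the label space covers every sense of every
# abbreviation, so no per-abbreviation classifier is needed.
import torch
from torch import nn
from transformers import AutoModel, AutoTokenizer

MODEL_NAME = "emilyalsentzer/Bio_ClinicalBERT"  # a publicly available clinical BERT

class OneFitsAllDisambiguator(nn.Module):
    def __init__(self, num_senses: int):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(MODEL_NAME)
        hidden = self.encoder.config.hidden_size
        # Single shared classification head; ReLU and layer sizes are assumptions.
        self.classifier = nn.Sequential(
            nn.Linear(hidden, hidden), nn.ReLU(), nn.Linear(hidden, num_senses)
        )

    def forward(self, input_ids, attention_mask):
        out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        cls = out.last_hidden_state[:, 0]  # deep contextualized [CLS] vector
        return self.classifier(cls)

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = OneFitsAllDisambiguator(num_senses=500)  # assumed size of the sense inventory

# One fine-tuning step with Adam (learning rate is an assumption).
batch = tokenizer(["pt was given 5 mg iv"], return_tensors="pt",
                  padding=True, truncation=True)
gold_sense = torch.tensor([42])  # index of the correct sense, e.g., "intravenous"
optimizer = torch.optim.Adam(model.parameters(), lr=2e-5)

optimizer.zero_grad()
logits = model(batch["input_ids"], batch["attention_mask"])
loss = nn.functional.cross_entropy(logits, gold_sense)
loss.backward()
optimizer.step()
```

Because the head is shared, senses that are rare for one abbreviation can still benefit from contextual patterns learned across all the others, which is the motivation for the one-fits-all design over a separate classifier per abbreviation.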

Results One-fits-all classifiers with deep contextualized representations from the Bioclinical BERT, BlueBERT, and MS_BERT pretrained models improved accuracy on the University of Minnesota data set, achieving 98.99, 98.75, and 99.13%, respectively. All models outperformed the previous state of the art of around 98.39%, with the best accuracy obtained by the MS_BERT model.

Conclusion Fine-tuning pretrained language models to obtain deep contextualized representations proved effective for disambiguating clinical abbreviations; the approach is robust for rare and unseen abbreviations and avoids building a separate classifier for each abbreviation. Transfer learning can improve the development of practical abbreviation disambiguation systems.

Ethical Approval

No human subjects were involved in this project, and institutional review board approval was not required.




Publication History

Received: 26 August 2021

Accepted: 29 October 2021

Article published online:
01 February 2022

© 2022. The Author(s). This is an open access article published by Thieme under the terms of the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License (CC BY-NC-ND 4.0), permitting copying and reproduction so long as the original work is given appropriate credit. Contents may not be used for commercial purposes, or adapted, remixed, transformed, or built upon. (https://creativecommons.org/licenses/by-nc-nd/4.0/)

Georg Thieme Verlag KG
Rüdigerstraße 14, 70469 Stuttgart, Germany

 
  • 39 Kim Juyong, and, Gong Linyuan, and, Khim Justin, and Weiss, Jeremy C, . and. Ravikumar P. Improved Clinical Abbreviation Expansion via Non-Sense-Based Approaches. 2020 . Available at: http://proceedings.mlr.press/v136/kim20a.html