Introducing Information Extraction to Radiology Information Systems to Improve the Efficiency on Reading Reports
03 January 2019
03 June 2019
12 September 2019 (online)
Background Radiology reports are a permanent record of a patient's health information and are often used in clinical practice and research. Clinicians and radiologists read radiology reports routinely, but doing so is laborious and time-consuming when the number of reports is large. Helping clinicians locate and assimilate the key information in reports could therefore substantially improve reading efficiency. Few studies have addressed information extraction from Chinese medical texts or its application in radiology information systems (RISs) to improve efficiency.
Objectives This study explored methods for extracting, grouping, ranking, delivering, and displaying medical named entities in radiology reports in order to improve efficiency in RISs.
Methods A total of 5,000 reports were obtained from two medical institutions for this study. We proposed a neural network model called Multi-Embedding-BGRU-CRF (bidirectional gated recurrent unit-conditional random field) for medical named entity recognition, and rule-based methods for entity grouping and ranking. We also present a methodology for delivering and displaying entities in RISs.
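As an illustration of the recognition step's output, a sequence-labeling model such as a BGRU-CRF typically emits one tag per character, and entities are then read off the tag sequence. The sketch below assumes a BIO tagging scheme and uses hypothetical entity labels (`LOC`, `FIND`) and a hypothetical helper name (`decode_bio`); the abstract does not specify the tag scheme or label set, so this is not the authors' implementation.

```python
from typing import List, Optional, Tuple


def decode_bio(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Collect (entity_text, entity_type) pairs from a BIO-tagged sequence.

    Assumes character-level tokens (as is common for Chinese text), so
    entity text is rebuilt by joining tokens without spaces.
    """
    entities: List[Tuple[str, str]] = []
    current_tokens: List[str] = []
    current_type: Optional[str] = None

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A new entity begins; close any entity still open.
            if current_tokens:
                entities.append(("".join(current_tokens), current_type))
            current_tokens, current_type = [token], tag[2:]
        elif tag.startswith("I-") and current_tokens and tag[2:] == current_type:
            # Continuation of the entity that is currently open.
            current_tokens.append(token)
        else:
            # "O" tag or a stray/mismatched "I-" tag: close the open entity
            # and skip this token.
            if current_tokens:
                entities.append(("".join(current_tokens), current_type))
            current_tokens, current_type = [], None

    if current_tokens:
        entities.append(("".join(current_tokens), current_type))
    return entities


# Hypothetical example: "右肺结节" (right lung nodule), tagged per character.
example_tokens = list("右肺结节")
example_tags = ["B-LOC", "I-LOC", "B-FIND", "I-FIND"]
example_entities = decode_bio(example_tokens, example_tags)
```

The decoded pairs are the kind of input that rule-based grouping and ranking steps could then operate on.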
Results The proposed neural named entity recognition model achieved an F1 score of 95.88%, and entity ranking achieved an accuracy of 99.23%. The weakest component of the system was entity grouping, which yielded an accuracy of 91.03%. The effectiveness of the overall solution was demonstrated in an evaluation task performed by two clinicians in a setup modeled on actual clinical practice.
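For readers unfamiliar with the F1 metric reported above, the following minimal sketch computes precision, recall, and F1 over sets of entity spans. It assumes exact-match, entity-level scoring; the abstract does not state which variant was used, and the span representation and function name are hypothetical.

```python
from typing import Set, Tuple

# Hypothetical span representation: (start_index, end_index, entity_type).
Span = Tuple[int, int, str]


def precision_recall_f1(gold: Set[Span], predicted: Set[Span]) -> Tuple[float, float, float]:
    """Exact-match entity-level precision, recall, and F1."""
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Illustrative spans only; these are not data from the study.
gold_spans = {(0, 2, "LOC"), (2, 4, "FIND"), (5, 7, "FIND")}
predicted_spans = {(0, 2, "LOC"), (2, 4, "FIND"), (8, 9, "LOC")}
precision, recall, f1 = precision_recall_f1(gold_spans, predicted_spans)
```

Because F1 is a harmonic mean, it is pulled toward the lower of precision and recall, so a high score requires both to be high.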
Conclusions The neural model shows great potential for extracting medical named entities from radiology reports, especially for languages that lack lexicons and natural language processing tools. The pipeline of extracting, grouping, ranking, delivering, and displaying medical named entities could be a feasible way to enhance RIS functionality through information extraction. Integrating information extraction into the RIS was shown to improve the efficiency of reading radiology reports.