Natural Language Processing Techniques for Extracting and Categorizing Finding Measurements in Narrative Radiology Reports
01 December 2014
accepted in revised form: 31 July 2015
19 December 2017 (online)
Background: Accumulating quantitative outcome parameters may contribute to constructing a healthcare organization in which outcomes of clinical procedures are reproducible and predictable. In imaging studies, measurements are the principal category of quantitative parameters.
Objectives: The purpose of this work is to develop and evaluate two natural language processing engines: one that extracts finding and organ measurements from narrative radiology reports, and one that categorizes the extracted measurements by their “temporality”.
Methods: The measurement extraction engine was developed as a set of regular expressions and evaluated against a manually created ground truth. Automated categorization of measurement temporality was defined as a machine learning problem. A ground truth was manually developed based on a corpus of radiology reports. A maximum entropy model was created using features that characterize the measurement itself and its narrative context. The model was evaluated in a ten-fold cross-validation protocol.
Results: The measurement extraction engine has precision 0.994 and recall 0.991. Accuracy of the measurement classification engine is 0.960.
Conclusions: The work contributes to machine understanding of radiology reports and may find use in software applications that process medical data.
Citation: Sevenster M, Buurman J, Liu P, Peters JF, Chang PJ. Natural language processing techniques for extracting and categorizing finding measurements in narrative radiology reports. Appl Clin Inform 2015; 6: 600–610
Keywords: Natural language processing - radiology report - measurement - maximum entropy - quantitative imaging
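The abstract describes the extraction engine only as "a set of regular expressions" evaluated against a manual ground truth, so the following is a minimal sketch of that general idea, not the authors' implementation. The pattern, the function names `extract_measurements` and `precision_recall`, the restriction to mm/cm units, and the example sentence are all assumptions made for illustration.

```python
import re

# Hypothetical pattern (not from the paper): one to three numbers
# separated by "x", followed by a length unit (mm or cm only).
NUMBER = r"\d+(?:\.\d+)?"
MEASUREMENT = re.compile(
    rf"({NUMBER}(?:\s*x\s*{NUMBER}){{0,2}})\s*(mm|cm)\b",
    re.IGNORECASE,
)


def extract_measurements(text):
    """Return a list of (dimensions, unit) tuples found in the text."""
    results = []
    for match in MEASUREMENT.finditer(text):
        dims = tuple(float(n) for n in re.findall(NUMBER, match.group(1)))
        results.append((dims, match.group(2).lower()))
    return results


def precision_recall(predicted, truth):
    """Precision and recall of predicted items against a ground-truth set."""
    predicted, truth = set(predicted), set(truth)
    true_positives = len(predicted & truth)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


# Made-up sentence for illustration; it is not taken from the study corpus.
report = "Right lower lobe nodule measures 1.2 x 0.8 cm, previously 9 mm."
found = extract_measurements(report)
```

In this sketch, a measurement introduced by "previously" would belong to a prior exam, which is the kind of distinction the temporality classifier in the paper is meant to capture; the sketch itself does not attempt that classification.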