Methods Inf Med 2016; 55(04): 373-380
DOI: 10.3414/ME15-02-0019
Focus Theme – Original Articles
Schattauer GmbH

Automated Classification of Selected Data Elements from Free-text Diagnostic Reports for Clinical Research[*]

Martin Löpprich** (1), Felix Krauss** (1), Matthias Ganzinger (1), Karsten Senghas (1), Stefan Riezler (2, 3), Petra Knaup (1)

1   Institute of Medical Biometry and Informatics, Heidelberg University, Heidelberg, Germany
2   Department of Computational Linguistics, Heidelberg University, Heidelberg, Germany
3   Interdisciplinary Center for Scientific Computing (IWR), Heidelberg University, Heidelberg, Germany
Funding: The Multiple Myeloma disease registry has been funded by the Dietmar-Hopp-Stiftung, Walldorf, Germany. CLIOMMICS is funded by the German Ministry of Education and Research within the e:Med initiative (grant ID: 01ZX1309A).
Further Information

Publication History

received: 15 December 2015

accepted in revised form: 25 April 2016

Publication Date: 08 January 2018 (online)

Summary

Objectives: In the Multiple Myeloma clinical registry at Heidelberg University Hospital, most data are extracted from discharge letters. Our aim was to analyze whether the manual documentation process can be made more efficient by applying natural language processing (NLP) methods for multiclass classification of free-text diagnostic reports, so that the diagnosis and state of disease of myeloma patients are documented automatically. The first objective was to create a corpus of free-text diagnosis paragraphs of patients with multiple myeloma from German diagnostic reports and to have documentation specialists manually annotate the relevant data elements. The second objective was to construct and evaluate a framework that uses different NLP methods for automatic multiclass classification of relevant data elements from free-text diagnostic reports.

Methods: The main diagnosis paragraph was extracted from the clinical reports of a randomly selected third of the patients in the multiple myeloma research database of Heidelberg University Hospital (737 patients in total). An electronic data capture (EDC) system was set up, and two data entry specialists independently documented at least nine specific data elements for multiple myeloma characterization. A third specialist compared and assessed both data entries, and an annotated text corpus was created. A framework was constructed, consisting of a self-developed package that splits multiple diagnosis sequences into several subsequences, four different preprocessing steps that normalize the input data, and two classifiers: a maximum entropy classifier (MEC) and a support vector machine (SVM). In total, 15 different pipelines were examined and assessed by ten-fold cross-validation, repeated 100 times. The average error rate and the average F1-score were calculated as quality indicators. The approximate randomization test was used for significance testing.
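
The approximate randomization test named in the Methods can be illustrated with a minimal sketch. It assumes paired per-item scores from two pipelines (for example 1 for a correct and 0 for a wrong classification) and uses the absolute difference of the mean scores as the test statistic. The function name, its parameters, and the choice of statistic are illustrative assumptions, not the authors' implementation.

```python
import random


def approximate_randomization(scores_a, scores_b, trials=10000, seed=0):
    """Two-sided paired approximate randomization test (illustrative sketch).

    scores_a, scores_b: per-item scores of two systems on the same items.
    Returns an estimated p-value for the observed difference in mean score.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("paired score lists must have equal length")
    rng = random.Random(seed)
    n = len(scores_a)
    observed = abs(sum(scores_a) - sum(scores_b)) / n
    at_least_as_extreme = 0
    for _ in range(trials):
        diff = 0.0
        for a, b in zip(scores_a, scores_b):
            # Under the null hypothesis the system labels are exchangeable,
            # so each pair is swapped with probability 0.5.
            if rng.random() < 0.5:
                a, b = b, a
            diff += a - b
        if abs(diff) / n >= observed:
            at_least_as_extreme += 1
    # Add-one correction commonly applied to randomization p-values.
    return (at_least_as_extreme + 1) / (trials + 1)


if __name__ == "__main__":
    # Hypothetical per-item correctness of two pipelines.
    pipeline_a = [1, 1, 1, 0, 1, 1, 1, 1, 0, 1]
    pipeline_b = [1, 0, 1, 0, 1, 0, 1, 1, 0, 0]
    print(approximate_randomization(pipeline_a, pipeline_b, trials=2000))
```

A small p-value suggests that the difference between the two pipelines is unlikely to arise if their outputs were interchangeable. The number of shuffles trades run time against the resolution of the estimate.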

Results: The annotated corpus consists of 737 different diagnosis paragraphs with a total of 865 coded diagnoses. The dataset is publicly available in the supplementary online files for training and testing of further NLP methods. Both classifiers showed low average error rates (MEC: 1.05; SVM: 0.84) and high F1-scores (MEC: 0.89; SVM: 0.92). However, the results varied widely depending on the classified data element. Preprocessing methods increased this effect and had a significant impact on the classification, both positive and negative. The automatic diagnosis splitter increased the average error rate significantly, even though the F1-score decreased only slightly.
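
For readers unfamiliar with the two quality measures, the following sketch shows one way to compute an error rate and an F1-score for a multiclass data element. It makes two assumptions that the summary does not confirm: the error rate is expressed as a percentage, and the F1-score is macro-averaged over classes. The class labels in the example are hypothetical.

```python
from collections import Counter


def error_rate(gold, predicted):
    """Share of misclassified items, as a percentage (unit is an assumption)."""
    wrong = sum(g != p for g, p in zip(gold, predicted))
    return 100.0 * wrong / len(gold)


def macro_f1(gold, predicted):
    """Unweighted mean of per-class F1 (macro averaging is an assumption)."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[g] += 1  # gold class gains a false negative
    labels = sorted(set(gold) | set(predicted))
    f1_scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); every label here occurs at least once.
        denominator = 2 * tp[label] + fp[label] + fn[label]
        f1_scores.append(2 * tp[label] / denominator if denominator else 0.0)
    return sum(f1_scores) / len(f1_scores)


if __name__ == "__main__":
    # Hypothetical gold annotations and classifier output for one data element.
    gold = ["IgG", "IgG", "IgA", "IgA"]
    predicted = ["IgG", "IgA", "IgA", "IgA"]
    print(error_rate(gold, predicted))
    print(round(macro_f1(gold, predicted), 4))
```

In a repeated cross-validation setting such as the one described in the Methods, both values would be computed per fold and then averaged over all folds and repetitions.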

Conclusions: The low average error rates and high average F1-scores of each pipeline demonstrate the suitability of the investigated NLP methods. However, it was also shown that there is no single best practice for the automatic classification of data elements from free-text diagnostic reports.

* Supplementary material is published on our website: http://dx.doi.org/10.3414/ME15-02-0019


** These authors contributed equally to this work


 