Yearb Med Inform 2008; 17(01): 128-144
DOI: 10.1055/s-0038-1638592
Original Article
Georg Thieme Verlag KG Stuttgart

Extracting Information from Textual Documents in the Electronic Health Record: A Review of Recent Research

S. M. Meystre (1), G. K. Savova (2), K. C. Kipper-Schuler (2), J. F. Hurdle (1)

1   Department of Biomedical Informatics, University of Utah School of Medicine, Salt Lake City, Utah, USA
2   Biomedical Informatics Research, Mayo Clinic College of Medicine, Rochester, Minnesota, USA

Publication History

Publication Date: 07 March 2018 (online)

Summary

Objectives: We examine recently published research on the extraction of information from textual documents in the Electronic Health Record (EHR).

Methods: We reviewed the research published after 1995, drawing on PubMed, conference proceedings, and the ACM Digital Library, as well as on relevant publications referenced in papers already included.

Results: 174 publications were selected and are discussed in this review in terms of: methods used; pre-processing of textual documents; detection and analysis of contextual features; extraction of information in general; extraction of codes and of information for decision support and enrichment of the EHR; information extraction for surveillance, research, automated terminology management, and data mining; and de-identification of clinical text.

Conclusions: The performance of information extraction systems on clinical text has improved since the last systematic review in 1995, but these systems are still rarely applied outside the laboratories in which they were developed. Competitive challenges for information extraction from clinical text, the availability of annotated clinical text corpora, and further improvements in system performance are important factors in stimulating advances in this field and in increasing the acceptance and use of these systems in concrete clinical and biomedical research contexts.

 