Automating case definitions using literature-based reasoning

T. Botsis; R. Ball

doi:10.4338/ACI-2013-04-RA-0028


Appl Clin Inform 2013; 04(04): 515-527
DOI: 10.4338/ACI-2013-04-RA-0028

Research Article

Schattauer GmbH


T. Botsis¹,², R. Ball¹

¹Office of Biostatistics and Epidemiology, Center for Biologics Evaluation and Research (CBER), Food and Drug Administration (FDA), Rockville, MD

²Department of Computer Science, University of Tromsø, Tromsø, Norway


Publication history

Received: 24 April 2013

Accepted: 8 October 2013

Publication date: 19 December 2017 (online)


Summary

Background: Establishing a Case Definition (CDef) is a first step in many epidemiological, clinical, surveillance, and research activities. Applying CDefs still relies on manual steps, which is a major source of inefficiency in surveillance and research.

Objective: Describe the need for, and propose an approach to, automating the useful representation of CDefs for medical conditions.

Methods: We translated the existing Brighton Collaboration CDef for anaphylaxis, relying mostly on the NLM MetaMap tool to identify synonyms for the criteria of the CDef. We also generated a CDef for the same condition from all the related PubMed abstracts, processing them with a text mining tool and then treating the synonyms with the same strategy. The co-occurrence of anaphylaxis and any other medical term within the same sentence of the abstracts supported the construction of a large semantic network. The ‘islands’ algorithm reduced the network and revealed its densest region, including the nodes that were used to represent the key criteria of the CDef. We evaluated the ability of the “translated” and the “generated” CDef to classify a set of 6034 H1N1 reports for anaphylaxis using two similarity approaches and compared them with our previous semi-automated classification approach.
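The sentence-level co-occurrence step described above can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes naive sentence splitting and plain substring matching in place of the text mining tool and MetaMap, and the function name `cooccurrence_edges`, the toy abstracts, and the vocabulary are hypothetical. The final weight filter is only a crude stand-in for network reduction and is not the ‘islands’ algorithm.

```python
import re
from collections import Counter

def cooccurrence_edges(abstracts, target, vocabulary):
    """Count sentences in which `target` co-occurs with each vocabulary term."""
    edges = Counter()
    for abstract in abstracts:
        # Naive sentence split: break after ., ! or ? followed by whitespace.
        for sentence in re.split(r"(?<=[.!?])\s+", abstract):
            text = sentence.lower()
            if target not in text:
                continue
            for term in vocabulary:
                # Substring matching is an assumption; a real system would
                # map text to concepts and handle synonyms.
                if term != target and term in text:
                    edges[(target, term)] += 1
    return edges

# Hypothetical toy data, not taken from the study.
abstracts = [
    "Anaphylaxis presented with urticaria and hypotension. The patient recovered.",
    "Urticaria is common in anaphylaxis. Fever was absent.",
]
vocabulary = ["anaphylaxis", "urticaria", "hypotension", "fever"]

edges = cooccurrence_edges(abstracts, "anaphylaxis", vocabulary)

# Crude stand-in for network reduction: keep edges seen in at least 2 sentences.
dense = {pair: n for pair, n in edges.items() if n >= 2}
```

Each edge weight is the number of sentences that mention both terms, so heavier edges mark the terms most closely tied to the condition in the literature.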

Results: Overall classification performance across approaches to producing CDefs was similar, with the generated CDef and vector space model with cosine similarity having the highest accuracy (0.825±0.003) and the semi-automated approach and vector space model with cosine similarity having the highest recall (0.809±0.042). Precision was low for all approaches.
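For readers unfamiliar with the vector space model, the sketch below shows one way cosine similarity between a report and a CDef could drive classification, and how accuracy, precision, and recall follow from the resulting labels. It is an illustration under stated assumptions rather than the study's implementation: the term lists, the 0.5 threshold, and the function names `cosine_similarity`, `classify`, and `evaluate` are all hypothetical.

```python
import math
from collections import Counter

def cosine_similarity(a, b):
    """Cosine of the angle between two term-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def classify(report_terms, cdef_terms, threshold=0.5):
    """Label a report positive when it is similar enough to the CDef.
    The threshold value is an assumption chosen for this example."""
    score = cosine_similarity(Counter(report_terms), Counter(cdef_terms))
    return score >= threshold

def evaluate(predicted, actual):
    """Accuracy, precision, and recall from paired boolean labels."""
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)
    fp = sum(1 for p, a in pairs if p and not a)
    fn = sum(1 for p, a in pairs if not p and a)
    tn = sum(1 for p, a in pairs if not p and not a)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }

# Hypothetical CDef criteria and reports, not taken from the study.
cdef = ["urticaria", "hypotension", "wheezing"]
reports = [
    (["urticaria", "hypotension", "wheezing"], True),
    (["fever", "headache"], False),
    (["urticaria", "fever"], True),
    (["urticaria", "wheezing", "rash"], False),
]

predicted = [classify(terms, cdef) for terms, _ in reports]
actual = [label for _, label in reports]
metrics = evaluate(predicted, actual)
```

In this toy set, one true case falls below the threshold and one non-case exceeds it, which shows how recall and precision can both suffer even when most criteria terms overlap.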

Conclusion: The useful representation of CDefs is a complicated task but potentially offers substantial gains in efficiency to support safety and clinical surveillance.

Citation: Botsis T, Ball R. Automating case definitions using literature-based reasoning. Appl Clin Inf 2013; 4: 515–527

http://dx.doi.org/10.4338/ACI-2013-04-RA-0028

Keywords

Case definition, safety surveillance, semantic networks, literature-based reasoning, anaphylaxis, similarity


