Linking Electronic Health Record and Trauma Registry Data: Assessing the Value of Probabilistic Linkage
Funding This study was supported by the Agency for Healthcare Research and Quality (R01HS023837, PI: Gurses).
23 April 2018
02 August 2018
15 March 2019 (online)
Background Electronic health record (EHR) systems contain large volumes of novel heterogeneous data that can be linked to trauma registry data to enable innovative research not possible with either data source alone.
Objective This article describes an approach for linking electronically extracted EHR data to trauma registry data at the institutional level and assesses the value of probabilistic linkage.
Methods Encounter data were independently obtained from the EHR data warehouse (n = 1,632) and the pediatric trauma registry (n = 1,829) at a Level I pediatric trauma center. Deterministic linkage was attempted using nine different combinations of medical record number (MRN), encounter ID (visit ID), age, gender, and emergency department (ED) arrival date. True matches from the best-performing variable combination were used to create a gold standard. The gold standard was used to evaluate the performance of each variable combination and to train a probabilistic algorithm, which was applied separately to (1) the records left unmatched by deterministic linkage and (2) the entire cohort. Additional records that matched probabilistically were investigated via chart review and compared against records that matched deterministically.
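The idea of training a probabilistic algorithm on a gold standard can be sketched with Fellegi-Sunter style match weights. This is a minimal illustration only, not the study's actual implementation: the field names, the add-one smoothing, and the use of exact field agreement are all assumptions made for the example.

```python
import math

# Hypothetical field names standing in for the five linkage variables.
FIELDS = ["mrn", "encounter_id", "age", "gender", "ed_arrival_date"]


def estimate_m_u(true_pairs, non_pairs):
    """Estimate per-field agreement probabilities.

    m[f]: probability that field f agrees among true matches (gold standard).
    u[f]: probability that field f agrees among known non-matches.
    """
    def agree_rate(pairs, field):
        agree = sum(1 for a, b in pairs if a[field] == b[field])
        # Add-one smoothing (an assumption here) keeps rates strictly in (0, 1),
        # so the logarithms below are always defined.
        return (agree + 1) / (len(pairs) + 2)

    m = {f: agree_rate(true_pairs, f) for f in FIELDS}
    u = {f: agree_rate(non_pairs, f) for f in FIELDS}
    return m, u


def match_weight(a, b, m, u):
    """Sum of log2 likelihood ratios over all fields for one candidate pair."""
    total = 0.0
    for f in FIELDS:
        if a[f] == b[f]:
            total += math.log2(m[f] / u[f])              # agreement weight
        else:
            total += math.log2((1 - m[f]) / (1 - u[f]))  # disagreement weight
    return total
```

Pairs whose total weight exceeds a chosen threshold would be treated as candidate matches. The abstract does not state the threshold or weighting scheme the authors used, so any cutoff applied with this sketch is the reader's choice.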
Results Deterministic linkage with exact matching on any three of MRN, encounter ID, age, gender, and ED arrival date gave the best yield, 1,276 true matches. An additional probabilistic linkage step following deterministic linkage yielded 110 true matches. These probabilistically matched records contained a significantly higher number of boys than the records that matched deterministically, and their failure to match deterministically was attributable to mismatched MRNs between the two data sets. Probabilistic linkage of the entire cohort yielded 1,363 true matches.
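The "any three of five" deterministic rule can be illustrated as follows. This is a sketch under stated assumptions, not the authors' code: records are represented as dictionaries with hypothetical field names, "any three" is read as "at least three fields agree exactly", and missing values are treated as non-agreeing.

```python
from itertools import product

# Hypothetical field names for the five linkage variables.
FIELDS = ("mrn", "encounter_id", "age", "gender", "ed_arrival_date")


def agreement_count(a, b):
    """Count fields that are present in record a and exactly equal in record b."""
    return sum(1 for f in FIELDS if a.get(f) is not None and a.get(f) == b.get(f))


def deterministic_link(ehr_records, registry_records, min_agree=3):
    """Return (ehr_index, registry_index) pairs agreeing on >= min_agree fields."""
    links = []
    for (i, a), (j, b) in product(enumerate(ehr_records), enumerate(registry_records)):
        if agreement_count(a, b) >= min_agree:
            links.append((i, j))
    return links
```

A rule this permissive can pair different patients who share age, gender, and arrival date, so candidate links would still need verification before being counted as true matches. The all-pairs comparison is acceptable for a cohort of under two thousand records per source; larger files would normally call for blocking on one variable first.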
Conclusion Combining deterministic linkage with an additional probabilistic step represents a robust approach for linking EHR data to trauma registry data. This approach may be generalizable to studies involving other registries and databases.
Keywords record linkage - deterministic linkage - probabilistic linkage - trauma registry - electronic health records
The use of both EHR and registry data for research was approved by the institutional review board of Johns Hopkins Medicine. The study received a waiver of the need for informed consent.
Human Subjects Protections
The study was performed in compliance with the World Medical Association Declaration of Helsinki on Ethical Principles for Medical Research Involving Human Subjects, and was reviewed by the Johns Hopkins Medicine Institutional Review Board.