Measuring the Degree of Unmatched Patient Records in a Health Information Exchange Using Exact Matching
01 December 2015
accepted: 26 February 2016
16 December 2017 (online)
Health information exchange (HIE) facilitates the exchange of patient information across different healthcare organizations. To match patient records across sites, HIEs usually rely on a master patient index (MPI), a database responsible for determining which medical records at different healthcare facilities belong to the same patient. A single patient’s records may be improperly split across multiple profiles in the MPI.
We investigated how often two individuals shared the same first name, last name, and date of birth in the Social Security Death Master File (SSDMF), a US government database containing over 85 million individuals, to determine the feasibility of using exact matching as a split record detection tool. We demonstrated how a method based on exact record matching could be used to partially measure the degree of probable split patient records in the MPI of an HIE.
We calculated the percentage of individuals who were uniquely identified in the SSDMF using first name, last name, and date of birth. We defined a measure as the average number of unique identifiers associated with a given combination of first name, last name, and date of birth. We calculated a reference value for this measure on a subsample of SSDMF data and compared it with the value of the same measure calculated on data from a functioning HIE.
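The two quantities described here can be illustrated with a minimal Python sketch. It is not the authors' implementation: the record layout (`id`, `first`, `last`, `dob`), the function names, the toy data, and the case and whitespace normalization are all assumptions made for illustration.

```python
from collections import defaultdict


def identifiers_per_key(records):
    """Group distinct identifiers under each exact (first, last, dob) key."""
    groups = defaultdict(set)
    for rec in records:
        # Assumption: names are compared after trimming and upper-casing;
        # the source does not say how fields were normalized.
        key = (rec["first"].strip().upper(),
               rec["last"].strip().upper(),
               rec["dob"])
        groups[key].add(rec["id"])
    return groups


def mean_identifiers_per_key(records):
    """Average number of distinct identifiers per (first, last, dob) key."""
    groups = identifiers_per_key(records)
    return sum(len(ids) for ids in groups.values()) / len(groups)


def percent_uniquely_identified(records):
    """Share of identifiers whose key is shared with no other identifier.

    Assumes each identifier appears under exactly one key.
    """
    groups = identifiers_per_key(records)
    total = sum(len(ids) for ids in groups.values())
    unique = sum(1 for ids in groups.values() if len(ids) == 1)
    return 100.0 * unique / total


# Hypothetical toy records: A1 and A2 share a name and date of birth.
records = [
    {"id": "A1", "first": "Ann", "last": "Lee", "dob": "1950-01-02"},
    {"id": "A2", "first": "Ann", "last": "Lee", "dob": "1950-01-02"},
    {"id": "B1", "first": "Bob", "last": "Ray", "dob": "1961-07-30"},
    {"id": "C1", "first": "Cy", "last": "Orr", "dob": "1972-11-15"},
]

print(mean_identifiers_per_key(records))     # 4 identifiers over 3 keys
print(percent_uniquely_identified(records))  # 2 of 4 identifiers are unique
```

In a reference population, a key with more than one identifier reflects two people who happen to share a name and birth date, so a value near 1 is expected. In an MPI, a value well above that reference suggests that some keys hold several profiles for one patient.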
We found that it was unlikely for two individuals to share the same first name, last name, and date of birth in a large US database including over 85 million individuals. In this dataset, 98.81% of individuals were uniquely identified using only these three items. We compared the value of our measure on a subsample of Social Security data (1.00089) to that of HIE data (1.1238) and found a significant difference (t-test p-value < 0.001).
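As an illustration of how such a comparison might be set up, the sketch below computes a Welch two-sample t statistic on per-key identifier counts. The abstract does not state which t-test variant was used, so the choice of Welch's test is an assumption, and the function name and sample values are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return Welch's t statistic and its approximate degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard errors of the two sample means.
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    t = (mean(sample_a) - mean(sample_b)) / sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1)
    )
    return t, df


# Hypothetical per-key identifier counts (not data from the study).
hie_counts = [1, 1, 2, 1, 3, 1, 2, 1]
reference_counts = [1, 1, 1, 1, 1, 1, 1, 2]

t_stat, dof = welch_t(hie_counts, reference_counts)
print(t_stat, dof)
```

A p-value would then be read from the t distribution with `dof` degrees of freedom, for example with a statistics package; that step is omitted here to keep the sketch within the standard library.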
This method may assist HIEs in detecting split patient records.