Methods Inf Med 2001; 40(03): 196-203
DOI: 10.1055/s-0038-1634155
Original Article
Schattauer GmbH

Probabilistic Record Linkage: Relationships between File Sizes, Identifiers, and Match Weights

L. J. Cook¹, L. M. Olson¹, J. M. Dean¹
¹ Intermountain Injury Control Research Center, Department of Pediatrics, University of Utah School of Medicine, Salt Lake City, Utah, USA

Publication Date: 07 February 2018 (online)

Abstract:

This study investigates the relationships among file sizes, the amount of information contained in commonly used record linkage variables, and the amount of information needed for a successful probabilistic linkage project. We present an equation that predicts the amount of information required. Match weights for variables commonly used in record linkage were measured using artificially created databases. Linkage algorithms were successful when the sum of minimum weights for the variables used in a linkage exceeded the predicted cutoff, and results were acceptable when this sum was near the predicted cutoff. This technique enables researchers to determine whether enough information exists to perform a successful probabilistic linkage.
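
The comparison described in the abstract, a sum of per-variable weights checked against a cutoff, can be sketched as follows. This is a minimal illustration and not the authors' procedure: it assumes the standard Fellegi-Sunter agreement weight, log2(m/u), where m is the probability that a variable agrees on a true match and u the probability that it agrees by chance. The abstract does not give the authors' cutoff equation or their definition of a minimum weight, so the cutoff is passed in as a plain number, and the function names and the m and u values are hypothetical.

```python
import math


def agreement_weight(m: float, u: float) -> float:
    """Fellegi-Sunter agreement weight in bits: log2(m / u).

    m: assumed probability the variable agrees for a true match.
    u: assumed probability the variable agrees for a non-match.
    """
    if not (0.0 < m <= 1.0 and 0.0 < u <= 1.0):
        raise ValueError("m and u must lie in (0, 1]")
    return math.log2(m / u)


def total_weight(variables: dict[str, tuple[float, float]]) -> float:
    """Sum the agreement weights over all linkage variables."""
    return sum(agreement_weight(m, u) for m, u in variables.values())


def enough_information(variables: dict[str, tuple[float, float]],
                       cutoff: float) -> bool:
    """True when the summed weight exceeds the supplied cutoff.

    The cutoff is an input here; the equation that predicts it is
    not reproduced in the abstract.
    """
    return total_weight(variables) > cutoff


if __name__ == "__main__":
    # Hypothetical (m, u) pairs, chosen only for illustration.
    variables = {
        "birth_month": (0.95, 1 / 12),
        "sex": (0.98, 0.5),
    }
    print(f"total weight: {total_weight(variables):.2f} bits")
    print("exceeds cutoff of 4 bits:", enough_information(variables, 4.0))
```

Taking the cutoff as a parameter keeps the sketch independent of any particular formula relating it to file sizes.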