Summary
Objectives:
Record linkage, the process of bringing together separately compiled but related
records from different databases, is essential in many areas of biomedical research.
We developed a record linkage program (EpiLink), which employs a simple mathematical
approach. We describe the program and present the results of testing it on a linkage
task.
Methods:
EpiLink was designed to be flexible: user-friendly settings tailor linkage and operating
parameters to specific linkage tasks, and the program can employ deterministic, probabilistic,
or sequential deterministic-probabilistic linkage strategies as required. The user
can also standardize data formats, examine linkage results, and accept or discard them.
We used EpiLink to link a subset of cases of the Lombardy Cancer Registry (20,724
records) with the Social Security file of the population (1,021,846 records) covered
by the registry. The linkage strategy consisted of a deterministic step followed by
several probabilistic linkage steps.
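The sequential strategy described above can be illustrated with a minimal sketch. It is not EpiLink's implementation: the field names, weights, and threshold are hypothetical, and a weighted string-similarity score stands in for the probabilistic step, whose actual model is not specified here. Every case is compared with every population record, which would need blocking or indexing at the scale of the files described.

```python
from difflib import SequenceMatcher

# Hypothetical identifying fields and weights, chosen only for illustration.
FIELDS = ("surname", "given_name", "birth_date", "birthplace")
WEIGHTS = {"surname": 4.0, "given_name": 3.0, "birth_date": 5.0, "birthplace": 2.0}
THRESHOLD = 0.85  # assumed acceptance threshold for the scored step


def normalize(value):
    """Standardize a field: upper-case and collapse whitespace."""
    return " ".join(str(value).upper().split())


def deterministic_key(record):
    """Exact-match key built from all standardized fields."""
    return tuple(normalize(record[f]) for f in FIELDS)


def similarity(a, b):
    """String similarity in [0, 1] between two standardized values."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def score(rec_a, rec_b):
    """Weighted mean of field similarities (a stand-in for a probabilistic weight)."""
    total = sum(WEIGHTS.values())
    return sum(WEIGHTS[f] * similarity(rec_a[f], rec_b[f]) for f in FIELDS) / total


def link(cases, population, threshold=THRESHOLD):
    """Return {case_id: (person_id, step, score)} for accepted links."""
    index = {}
    for pid, person in population.items():
        index.setdefault(deterministic_key(person), []).append(pid)

    links, unmatched = {}, []
    # Step 1: deterministic linkage on exact agreement of all fields.
    for cid, case in cases.items():
        hits = index.get(deterministic_key(case), [])
        if len(hits) == 1:
            links[cid] = (hits[0], "deterministic", 1.0)
        else:
            unmatched.append(cid)  # no hit or ambiguous: defer to step 2

    # Step 2: score the remaining cases against every population record.
    for cid in unmatched:
        best_pid, best_score = None, 0.0
        for pid, person in population.items():
            s = score(cases[cid], person)
            if s > best_score:
                best_pid, best_score = pid, s
        if best_pid is not None and best_score >= threshold:
            links[cid] = (best_pid, "scored", best_score)
    return links
```

Links accepted in the second step carry their score, so they can be reviewed manually and accepted or discarded.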
Results:
Manual inspection of the results showed that EpiLink achieved 98.8% specificity and
96.5% sensitivity.
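For reference, the two measures can be computed from manually reviewed counts as in the sketch below. It uses the standard definitions; how record pairs were counted in this evaluation is not stated here, and the counts in the usage note are made up for illustration and are not the study's data.

```python
def sensitivity(true_links, missed_links):
    """Proportion of true matches that the linkage found: TP / (TP + FN)."""
    denominator = true_links + missed_links
    if denominator == 0:
        raise ValueError("no true matches to evaluate")
    return true_links / denominator


def specificity(true_non_links, false_links):
    """Proportion of true non-matches correctly left unlinked: TN / (TN + FP)."""
    denominator = true_non_links + false_links
    if denominator == 0:
        raise ValueError("no true non-matches to evaluate")
    return true_non_links / denominator
```

With hypothetical counts of 90 found and 10 missed matches, `sensitivity(90, 10)` gives 0.9; with 95 correct non-links and 5 false links, `specificity(95, 5)` gives 0.95.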
Conclusions:
EpiLink is a practical and accurate tool for linking records from different databases;
it can be used by non-statisticians and is efficient in terms of human and financial
resources.
Keywords
Cancer registry - record linkage - computer program - follow-up methods