Summary
Objectives:
The usability of terminological systems (TSs) depends strongly on the coverage and
correctness of their content. The objective of this study was to provide a literature-based
overview of aspects of the content of TSs and of methods for evaluating that content,
and to investigate the extent to which these methods overlap or complement each other.
Methods:
We reviewed the literature and composed definitions for aspects of the evaluation of
the content of TSs. Of the methods described in the literature, three were selected: 1)
Concept matching, in which two samples of concepts, representing a) documentation of
reasons for admission in daily care practice and b) aggregation of patient groups
for research, are looked up in the TS in order to assess its coverage; 2) Formal algorithmic
evaluation, in which reasoning on the formally represented content is used to detect
inconsistencies; and 3) Expert review, in which a random sample of concepts is checked
for incorrect and incomplete terms and relations. These evaluation methods were applied
in a case study on the locally developed TS DICE (Diagnoses for Intensive Care Evaluation).
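For illustration only (the study does not specify an implementation), the following is a minimal Python sketch of how concept-matching coverage might be computed for a sample of concepts. The flat set of TS terms, the string-based match categories, and the partial-match heuristic are all assumptions made for this example.

```python
from collections import Counter

# Illustrative match categories for grading how a sampled concept maps onto the TS.
PERFECT, PARTIAL, NO_MATCH = "perfect", "partial", "no match"

def match_concept(concept: str, ts_terms: set[str]) -> str:
    """Grade one concept against a hypothetical set of TS terms."""
    if concept.lower() in ts_terms:
        return PERFECT
    # Crude surrogate for a partial match: every word occurs in some TS term.
    words = concept.lower().split()
    if words and all(any(w in t for t in ts_terms) for w in words):
        return PARTIAL
    return NO_MATCH

def coverage(sample: list[str], ts_terms: set[str]) -> dict[str, float]:
    """Proportion of each match category in a sample of concepts."""
    counts = Counter(match_concept(c, ts_terms) for c in sample)
    return {cat: counts[cat] / len(sample) for cat in (PERFECT, PARTIAL, NO_MATCH)}

# Example: grading one small sample (e.g. reasons for admission) against a toy TS.
ts = {"sepsis", "acute respiratory failure", "cardiac arrest"}
print(coverage(["Sepsis", "respiratory failure", "stroke"], ts))
```

In the study, two such samples (daily-care documentation and research aggregation) would each be graded against the same TS, and the proportions of perfect matches compared.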
Results:
None of the applied methods covered all the aspects of the content of a TS. The results
of concept matching differed for the two use cases (63% vs. 52% perfect matches).
Expert review revealed many more errors and instances of incompleteness than formal
algorithmic evaluation did.
Conclusions:
To evaluate the content of a TS, a combination of evaluation methods is preferable.
Different representative samples, reflecting different uses of the TS, lead to different
concept-matching results. Expert review appears to be very valuable but time-consuming.
Formal algorithmic evaluation has the potential to reduce the workload of human
reviewers, but it detects only logical inconsistencies. Further research is required to
exploit the full potential of formal algorithmic evaluation.
Keywords
Terminological systems - evaluation - methods - definitions