Abstract:
While natural language processing systems are beginning to see clinical use, it remains
unclear whether they can be disseminated effectively through the health care community.
MedLEE, a general-purpose natural language processor developed for Columbia-Presbyterian
Medical Center, was compared with physicians in its ability to detect seven clinical conditions
in 200 Brigham and Women's Hospital chest radiograph reports. Using the system on
the new institution's reports resulted in a small but measurable drop in performance
(it was distinguishable from physicians at p = 0.011). Adjusting the
interpretation of the processor's coded output (without changing the processor itself)
better accommodated local behavior, and performance improved so that it was indistinguishable
from that of the physicians. Pairs of physicians disagreed on at least one condition for 22%
of reports; the sources of disagreement appeared to be interpretation of findings,
gauging likelihood and degree of disease, and coding errors.
Keywords:
Natural Language Processing; Computerized Medical Record Systems; Thoracic Radiography;
Evaluation