Appl Clin Inform 2020; 11(04): 622-634
DOI: 10.1055/s-0040-1715567
Research Article

A Rule-Based Data Quality Assessment System for Electronic Health Record Data

Zhan Wang (1), John R. Talburt (2), Ningning Wu (2), Serhan Dagtas (2), Meredith Nahm Zozus (1)

1  Department of Population Health Science, University of Texas Health Science Center at San Antonio, San Antonio, Texas, United States
2  Department of Information Science, University of Arkansas at Little Rock, Little Rock, Arkansas, United States

Abstract

Objective Rule-based data quality assessment in health care facilities was explored through compilation, implementation, and evaluation of 63,397 data quality rules in a single-center case study. The aim was to assess the ability of rule-based data quality assessment to identify data errors of importance to physicians and system owners.

Methods We applied a design science framework to design, demonstrate, test, and evaluate a scalable framework with which data quality rules can be managed and used in health care facilities for data quality assessment and monitoring.

Results We identified 63,397 rules partitioned into 28 logic templates. A total of 819,683 discrepancies were identified by 4.5% of the rules. Nine out of 11 participating clinical and operational leaders indicated that the rules identified data quality problems and articulated next steps that they wanted to take based on the reported information.

Discussion The combined rule template and knowledge table approach makes governance and maintenance of otherwise large rule sets manageable. Two challenges to rule-based data quality monitoring were identified: a lack of curated and maintained knowledge sources relevant to data error detection, and a lack of organizational resources to support clinical and operational leaders in investigating and characterizing data errors and pursuing corrective and preventative actions. Limitations of our study included implementation within a single center and dependence of the results on the implemented rule set.

Conclusion This study demonstrates a scalable framework (up to 63,397 rules) with which data quality rules can be implemented and managed in health care facilities to identify data errors. The data quality problems identified at the implementation site were important enough to prompt action requests from clinical and operational leaders.
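
The rule template and knowledge table approach mentioned in the Discussion can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not describe the 28 logic templates, so the template shown here (a diagnosis that is incompatible with a recorded sex), the names `Rule`, `sex_incompatible_diagnosis_template`, `expand_rules`, and `find_discrepancies`, the knowledge table rows, and the placeholder codes such as `DX_PREGNANCY` are all assumptions made for illustration. The sketch only shows the general idea that one template combined with many knowledge table rows yields many concrete rules.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Record = Dict[str, object]


@dataclass(frozen=True)
class Rule:
    """One concrete data quality rule produced from a template (hypothetical)."""
    rule_id: str
    description: str
    check: Callable[[Record], bool]  # returns True when the record is discrepant


def sex_incompatible_diagnosis_template(dx_code: str, sex: str) -> Rule:
    """Hypothetical logic template: flag diagnosis `dx_code` recorded with sex `sex`."""
    return Rule(
        rule_id=f"sex_dx:{dx_code}:{sex}",
        description=f"Diagnosis {dx_code} recorded for a patient with sex {sex}",
        check=lambda rec: rec.get("dx_code") == dx_code and rec.get("sex") == sex,
    )


# Hypothetical knowledge table: each row parameterizes the template above.
# The codes are placeholders, not real terminology codes.
KNOWLEDGE_TABLE: List[Dict[str, str]] = [
    {"dx_code": "DX_PREGNANCY", "sex": "M"},
    {"dx_code": "DX_PROSTATE", "sex": "F"},
]


def expand_rules(template: Callable[[str, str], Rule],
                 table: List[Dict[str, str]]) -> List[Rule]:
    """Generate one concrete rule per knowledge table row."""
    return [template(row["dx_code"], row["sex"]) for row in table]


def find_discrepancies(rules: List[Rule],
                       records: List[Record]) -> List[Tuple[str, object]]:
    """Return (rule_id, record_id) for every record that violates a rule."""
    return [
        (rule.rule_id, rec["record_id"])
        for rec in records
        for rule in rules
        if rule.check(rec)
    ]


# Made-up example records, for demonstration only.
records: List[Record] = [
    {"record_id": 1, "sex": "M", "dx_code": "DX_PREGNANCY"},
    {"record_id": 2, "sex": "F", "dx_code": "DX_PREGNANCY"},
    {"record_id": 3, "sex": "F", "dx_code": "DX_PROSTATE"},
]

rules = expand_rules(sex_incompatible_diagnosis_template, KNOWLEDGE_TABLE)
discrepancies = find_discrepancies(rules, records)
```

Under this design, adding a rule means adding a row to the knowledge table rather than writing new logic, which is one plausible reading of why a large rule set stays manageable when it is partitioned into a small number of templates.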

Protection of Human and Animal Subjects

The authors declare that human and/or animal subjects were not included in the project.




Publication History

Received: 02 March 2020

Accepted: 06 July 2020

Publication Date: 23 September 2020 (online)

Georg Thieme Verlag KG
Stuttgart · New York