A Rule-Based Data Quality Assessment System for Electronic Health Record Data
Objective We explored rule-based data quality assessment in health care facilities by compiling, implementing, and evaluating 63,397 data quality rules in a single-center case study. The aim was to assess whether rule-based data quality assessment can identify data errors of importance to physicians and system owners.
Methods We applied a design science framework to design, demonstrate, test, and evaluate a scalable approach for managing and using data quality rules in health care facilities for data quality assessment and monitoring.
Results We identified 63,397 rules partitioned into 28 logic templates. A total of 819,683 discrepancies were identified by 4.5% of the rules. Nine out of 11 participating clinical and operational leaders indicated that the rules identified data quality problems and articulated next steps that they wanted to take based on the reported information.
Discussion The combined rule template and knowledge table approach makes governance and maintenance of otherwise large rule sets manageable. Identified challenges to rule-based data quality monitoring included the lack of curated and maintained knowledge sources relevant to data error detection and lack of organizational resources to support clinical and operational leaders with investigation and characterization of data errors and pursuit of corrective and preventative actions. Limitations of our study included implementation within a single center and dependence of the results on the implemented rule set.
Conclusion This study demonstrates a scalable framework (up to 63,397 rules) with which data quality rules can be implemented and managed in health care facilities to identify data errors. The data quality problems identified at the implementation site were important enough to prompt action requests from clinical and operational leaders.
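The combination of rule templates and knowledge tables mentioned above can be illustrated with a minimal sketch. It is not the implementation used in the study: the range-check template, the field names, the plausibility limits, and the helper functions (`range_template`, `build_rules`, `run_rules`, `share_of_rules_firing`) are all hypothetical. The idea shown is that one logic template is written once, and each row of a knowledge table instantiates one concrete rule from it, so a large rule set is maintained as data rather than as code.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List

# Hypothetical record layout (one dict per observation); not taken from the study.
Record = Dict[str, object]


@dataclass(frozen=True)
class Rule:
    rule_id: str
    template: str
    check: Callable[[Record], bool]  # returns True when the record is discrepant


def range_template(row: dict) -> Rule:
    """Illustrative template: flag a numeric field outside [low, high]."""
    field, low, high = row["field"], row["low"], row["high"]

    def check(rec: Record) -> bool:
        value = rec.get(field)
        # Missing values are not flagged by this template.
        return value is not None and not (low <= value <= high)

    return Rule(rule_id=f"range:{field}", template="range", check=check)


# Hypothetical knowledge table: one row per rule, illustrative limits only.
RANGE_KNOWLEDGE = [
    {"field": "heart_rate", "low": 0, "high": 300},
    {"field": "temperature_c", "low": 25.0, "high": 45.0},
]


def build_rules(template: Callable[[dict], Rule], table: Iterable[dict]) -> List[Rule]:
    """Instantiate one rule per knowledge-table row."""
    return [template(row) for row in table]


def run_rules(rules: List[Rule], records: Iterable[Record]) -> Dict[str, int]:
    """Count discrepancies per rule id."""
    counts = {rule.rule_id: 0 for rule in rules}
    for rec in records:
        for rule in rules:
            if rule.check(rec):
                counts[rule.rule_id] += 1
    return counts


def share_of_rules_firing(counts: Dict[str, int]) -> float:
    """Fraction of rules that identified at least one discrepancy."""
    if not counts:
        return 0.0
    return sum(1 for n in counts.values() if n > 0) / len(counts)


rules = build_rules(range_template, RANGE_KNOWLEDGE)
records = [
    {"heart_rate": 72, "temperature_c": 37.0},
    {"heart_rate": 450, "temperature_c": 36.5},  # implausible heart rate
    {"heart_rate": None, "temperature_c": 38.0},  # missing value, not flagged
]
counts = run_rules(rules, records)
```

In this sketch, adding a rule means adding a row to the table, and the two summary quantities (discrepancies per rule, and the share of rules that fired) correspond in kind to the figures reported in the Results.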
Protection of Human and Animal Subjects
The authors declare that human and/or animal subjects were not included in the project.
Received: 02 March 2020
Accepted: 06 July 2020
Published online: 23 September 2020
Georg Thieme Verlag KG
Stuttgart · New York