A New Paradigm to Analyze Data Completeness of Patient Data
We would like to thank Dr. Thomas Wan for his valuable guidance on this project. This article is an extended version of a paper by the authors published in the Proceedings of the SDPS 2015 Annual Conference; the society has granted permission to publish it in any journal.
received: 26 April 2016
accepted: 04 July 2016
19 December 2017 (online)
Background: There is a need for a tool that measures the data completeness of patient records using sophisticated statistical metrics. Patient data integrity is important for providing timely and appropriate care. Assessing completeness is an important step toward that integrity, with an emphasis on understanding the complex relationships between data fields and their relative importance in delivering care. Such a tool will not only help identify where data problems lie but also help uncover the underlying issues behind them.
Objectives: To develop a tool that can be used alongside a variety of health care database software packages to determine the completeness of individual patient records as well as aggregate patient records across health care centers and subpopulations.
Methods: The methodology of this project is encapsulated within the Data Completeness Analysis Package (DCAP) tool, with the major components including concept mapping, CSV parsing, and statistical analysis.
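The CSV parsing and scoring steps named in the Methods can be illustrated with a minimal sketch. It is not DCAP's actual implementation: the field names, the weights standing in for the relative importance a concept map might assign, the missing-value tokens, and the function names are all hypothetical assumptions chosen for illustration.

```python
import csv
import io

# Hypothetical field weights standing in for the relative importance
# that a concept map might assign; not taken from DCAP.
FIELD_WEIGHTS = {"age": 1.0, "sex": 1.0, "diagnosis": 3.0, "payer": 0.5}

# Values treated as missing; this set is an assumption, not DCAP's rule.
MISSING_TOKENS = {"", "na", "n/a", "null", "."}


def is_present(value):
    """Return True when a cell holds something other than a missing-value token."""
    return value is not None and value.strip().lower() not in MISSING_TOKENS


def record_completeness(row, weights=FIELD_WEIGHTS):
    """Weighted share of the fields in one record that are filled in (0.0 to 1.0)."""
    total = sum(weights.values())
    filled = sum(w for field, w in weights.items() if is_present(row.get(field)))
    return filled / total


def score_csv(text, weights=FIELD_WEIGHTS):
    """Parse CSV text and return a list of per-record completeness scores."""
    reader = csv.DictReader(io.StringIO(text))
    return [record_completeness(row, weights) for row in reader]


# Made-up sample rows; not HCUP SID data.
SAMPLE = """age,sex,diagnosis,payer
54,F,I10,Medicare
61,,E11,
,M,,NA
"""

scores = score_csv(SAMPLE)
for index, score in enumerate(scores, start=1):
    print(f"record {index}: completeness = {score:.2f}")
```

Weighting fields unequally is one way to reflect that some fields matter more to care delivery than others; with equal weights the score reduces to the plain fraction of non-missing fields.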
Results: Testing DCAP with Healthcare Cost and Utilization Project (HCUP) State Inpatient Database (SID) data shows that the tool successfully identifies relative data completeness at the patient, subpopulation, and database levels. These results also underscore the need for further analysis and call for hypothesis-driven research to find the underlying causes of data incompleteness.
Conclusion: DCAP examines patient records and generates statistics that can be used to determine the completeness of individual patient data as well as the general thoroughness of record keeping in a medical database. DCAP uses a component customized to the settings of the software package that stores the patient data, together with a Comma Separated Values (CSV) file parser, to determine the appropriate measurements. DCAP itself is assessed through a proof-of-concept exercise using hypothetical data as well as available HCUP SID patient data.
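Reporting completeness at the patient, subpopulation, and database levels can be sketched as a simple roll-up of per-record scores. This is an illustrative assumption about how such a summary might be organized, not the statistical analysis DCAP performs; the record layout, group labels, scores, and the `summarize` function are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical per-record completeness scores (0.0 to 1.0), each tagged
# with a subpopulation label; these values are made up for illustration.
records = [
    {"patient_id": "p1", "group": "hospital_A", "score": 1.0},
    {"patient_id": "p2", "group": "hospital_A", "score": 0.5},
    {"patient_id": "p3", "group": "hospital_B", "score": 0.75},
    {"patient_id": "p4", "group": "hospital_B", "score": 0.25},
]


def summarize(records):
    """Roll per-record scores up to subpopulation and database averages."""
    by_group = defaultdict(list)
    for rec in records:
        by_group[rec["group"]].append(rec["score"])
    return {
        "patient": {rec["patient_id"]: rec["score"] for rec in records},
        "subpopulation": {group: mean(vals) for group, vals in by_group.items()},
        "database": mean(rec["score"] for rec in records),
    }


summary = summarize(records)
print(summary["subpopulation"])
print(summary["database"])
```

A plain mean is used here only to keep the example small; comparing subpopulations rigorously would call for a formal statistical test.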
Citation: Nasir A, Gurupur V, Liu X. A new paradigm to analyze data completeness of patient data.