An Online Tool for Correcting Performance Measures of Electronic Phenotyping Algorithms for Verification Bias

Ajay Bhasin; Sue Bielinski; Abel N. Kho; Nicholas Larson; Laura J. Rasmussen-Torvik

doi:10.1055/a-2402-5937

CC BY 4.0 · ACI open 2024; 08(02): e89-e93
DOI: 10.1055/a-2402-5937

Case Report

Authors

Ajay Bhasin
¹Northwestern University Feinberg School of Medicine, Chicago, United States

Sue Bielinski
¹Northwestern University Feinberg School of Medicine, Chicago, United States

Abel N. Kho
¹Northwestern University Feinberg School of Medicine, Chicago, United States

Nicholas Larson
¹Northwestern University Feinberg School of Medicine, Chicago, United States

Laura J. Rasmussen-Torvik
¹Northwestern University Feinberg School of Medicine, Chicago, United States

Funding None.

Abstract

Objectives Computable or electronic phenotypes of patient conditions are increasingly common in quality improvement and clinical research. Phenotyping algorithm validation often relies on standard classification performance measures (i.e., sensitivity, specificity, positive predictive value, negative predictive value, and accuracy). When validation is performed on a randomly sampled patient population, direct estimates of these measures are valid. However, studies commonly sample patients conditional on the algorithm result before validation, which introduces a form of bias known as verification bias.

Methods We illustrate validation study sampling design, along with naïve and bias-corrected validation performance, through both a concrete example (1,000 cases, 100 noncases, 1:1 sampling on predicted status) and a more thorough simulation study under varied realistic scenarios. We additionally describe the development of a free web calculator that adjusts these estimates, intended for people validating phenotyping algorithms.

Results In our illustrative example, naïve performance estimates corresponded to 0.942 sensitivity, 0.979 specificity, and 0.960 accuracy; these contrast with bias-corrected estimates of 0.620 sensitivity, 0.999 specificity, and 0.944 accuracy obtained by adjusting for verification bias using our free calculator. Our simulation results demonstrate increasing positive bias for sensitivity and negative bias for specificity as the disease prevalence approaches zero, with decreasing positive predictive value moderately exacerbating these biases.

Conclusion Validation of novel computable phenotypes of patient conditions must account for verification bias when calculating performance measures of the algorithm. Because these measures may vary substantially with disease prevalence in the source population, use of a free web calculator to adjust them is desirable.
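
As an illustration of the kind of adjustment described above, the following is a minimal sketch of one standard correction for sampling that is conditional on the algorithm result, in the spirit of the approach of Begg and Greenes (1983). It assumes that positive and negative predictive values estimated within each sampled stratum are unbiased, and it reweights them by the number of algorithm-positive and algorithm-negative patients in the source population. The function names, the `ValidationCounts` structure, and all counts in the example are hypothetical and are not taken from the article; whether the authors' calculator uses exactly this computation is not stated here.

```python
from dataclasses import dataclass


@dataclass
class ValidationCounts:
    """Chart-review results from a sample stratified on the algorithm result.

    All names here are illustrative assumptions, not the article's notation.
    """
    tp: int  # algorithm-positive, confirmed case
    fp: int  # algorithm-positive, confirmed noncase
    fn: int  # algorithm-negative, confirmed case
    tn: int  # algorithm-negative, confirmed noncase


def naive_measures(c: ValidationCounts) -> dict:
    """Compute measures directly from the validation sample (biased when the
    sample is drawn conditional on the algorithm result)."""
    total = c.tp + c.fp + c.fn + c.tn
    return {
        "sensitivity": c.tp / (c.tp + c.fn),
        "specificity": c.tn / (c.tn + c.fp),
        "accuracy": (c.tp + c.tn) / total,
    }


def corrected_measures(c: ValidationCounts, n_alg_pos: int, n_alg_neg: int) -> dict:
    """Reweight stratum-specific predictive values by the source-population
    counts of algorithm-positive and algorithm-negative patients."""
    ppv = c.tp / (c.tp + c.fp)  # unbiased within the algorithm-positive stratum
    npv = c.tn / (c.tn + c.fn)  # unbiased within the algorithm-negative stratum

    # Expected cell counts if every patient in the source population were reviewed.
    tp_pop = n_alg_pos * ppv
    fp_pop = n_alg_pos * (1 - ppv)
    fn_pop = n_alg_neg * (1 - npv)
    tn_pop = n_alg_neg * npv

    return {
        "ppv": ppv,
        "npv": npv,
        "sensitivity": tp_pop / (tp_pop + fn_pop),
        "specificity": tn_pop / (tn_pop + fp_pop),
        "accuracy": (tp_pop + tn_pop) / (n_alg_pos + n_alg_neg),
    }


# Hypothetical example: 500 algorithm-positive and 9,500 algorithm-negative
# patients in the source population, with 100 charts reviewed from each group.
counts = ValidationCounts(tp=95, fp=5, fn=2, tn=98)
naive = naive_measures(counts)
corrected = corrected_measures(counts, n_alg_pos=500, n_alg_neg=9500)

for name in ("sensitivity", "specificity", "accuracy"):
    print(f"{name}: naive={naive[name]:.4f} corrected={corrected[name]:.4f}")
```

With these hypothetical counts, sensitivity falls from about 0.98 (naïve) to about 0.71 (corrected), while specificity rises from about 0.95 to about 0.997. The direction of the shift matches the pattern described in the Results: when the algorithm-negative stratum is much larger than the algorithm-positive stratum, even a small false-negative proportion in the sample corresponds to many missed cases in the source population.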

Keywords

data analysis - evaluation - system improvement - statistical methods - data collection

Protection of Human and Animal Subjects

No human subjects were involved in the project.

Publication History

Received: 06 May 2024

Accepted: 18 July 2024

Article published online: 27 December 2024

© 2024. The Author(s). This is an open access article published by Thieme under the terms of the Creative Commons Attribution License, permitting unrestricted use, distribution, and reproduction so long as the original work is properly cited. (https://creativecommons.org/licenses/by/4.0/)

Georg Thieme Verlag KG
Oswald-Hesse-Straße 50, 70469 Stuttgart, Germany

