Using Machine Learning to Capture Quality Metrics from Natural Language: A Case Study of Diabetic Eye Exams

Allan Fong; Nicholas Scoulios; H. Joseph Blumenthal; Ryan E. Anderson

doi:10.1055/s-0041-1736311

Methods Inf Med 2021; 60(03/04): 110-115
DOI: 10.1055/s-0041-1736311

Short Paper

Allan Fong¹, Nicholas Scoulios², H. Joseph Blumenthal¹, Ryan E. Anderson³,⁴

¹National Center for Human Factors in Healthcare, MedStar Health, Washington, District of Columbia, United States

²Department of Hospital Medicine, Internal Medicine, Stanford University School of Medicine, Stanford, California, United States

³Division of General Internal Medicine, Department of Medicine, MedStar Georgetown University Hospital, Washington, District of Columbia, United States

⁴MedStar Institute for Quality and Safety, MedStar Health Research Institute, MedStar Health, Washington, District of Columbia, United States

Funding None.

Abstract

Background and Objective The prevalence of value-based payment models has led to an increased use of the electronic health record to capture quality measures, necessitating additional documentation requirements for providers.

Methods This case study uses text mining and natural language processing techniques to identify the timely completion of diabetic eye exams (DEEs) from 26,203 unique clinician notes for reporting as an electronic clinical quality measure (eCQM). Logistic regression and support vector machine (SVM) models were trained on unbalanced datasets and on datasets balanced with the synthetic minority over-sampling technique (SMOTE) algorithm, and were evaluated on precision, recall (sensitivity), specificity, and f1-score for classifying records positive for DEE. We then integrated a high-precision DEE model to evaluate free-text clinical narratives from our clinical EHR system.
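The evaluation metrics named above can be illustrated with a minimal sketch. The function name `classification_metrics` and the toy labels are hypothetical and are not taken from the study; the sketch only shows how precision, recall (which is the same quantity as sensitivity), specificity, and f1-score follow from the four confusion-matrix counts.

```python
def classification_metrics(y_true, y_pred):
    """Compute precision, recall (sensitivity), specificity, and F1 for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives

    # Guard each ratio against an empty denominator.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
    }


# Hypothetical toy labels (1 = note documents a completed DEE); not study data.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 0, 0, 1]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

With rare positives, as in this toy example, specificity can stay high even when precision and recall are modest, which is one reason to report all of these metrics together.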

Results Logistic regression and SVM models had comparable f1-score and specificity metrics, with models trained and validated without oversampling favoring precision over recall. SVM with and without oversampling resulted in the best precision, 0.96, and recall, 0.85, respectively. These two SVM models were applied to the 31,585 unannotated text segments representing 24,823 unique records and 13,714 unique patients. The number of records classified as positive for DEE using the SVM models ranged from 667 to 8,935 (2.7–36% of 24,823, respectively). Unique patients classified as positive for DEE ranged from 3.5 to 41.8%, highlighting the potential utility of these models.
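As a quick arithmetic check, the record-level percentages follow directly from the counts reported above. The variable names in this short sketch are illustrative only.

```python
# Counts reported in the Results paragraph.
total_records = 24_823
positive_low = 667     # fewest records classified positive for DEE
positive_high = 8_935  # most records classified positive for DEE

pct_low = 100 * positive_low / total_records
pct_high = 100 * positive_high / total_records

print(f"{pct_low:.1f}% to {pct_high:.1f}%")  # prints "2.7% to 36.0%"
```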

Discussion We believe the impact of oversampling on SVM model performance was caused by potential overfitting of the SVM SMOTE model to the synthesized data and by the data synthesis process itself. However, the specificities of SVM with and without SMOTE were comparable, suggesting both models were confident in their negative predictions. By prioritizing the SVM model with higher precision over the one with higher recall (sensitivity) in the categorization of DEEs, we can provide a highly reliable pool of results that can be documented through automation, reducing the burden of secondary review. Although the focus of this work was on completed DEEs, this method could be applied to completing other necessary documentation by extracting information from natural language in clinician notes.
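To make the data synthesis step concrete, the following is a simplified, SMOTE-style sketch and not the authors' implementation. It assumes numeric feature vectors and picks base points at random, whereas library implementations (for example, `SMOTE` in the imbalanced-learn package) handle sampling ratios and neighbor search more carefully. The function name `smote_like_oversample` and the toy points are hypothetical.

```python
import math
import random


def smote_like_oversample(minority, n_synthetic, k=2, seed=0):
    """Create synthetic minority points by interpolating toward a nearby minority neighbor.

    Simplified illustration of the SMOTE idea; requires at least two minority points.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        # The k nearest minority neighbors of the base point, excluding the point itself.
        neighbors = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbor = rng.choice(neighbors)
        gap = rng.random()  # uniform in [0, 1)
        # The new point lies on the line segment between base and neighbor.
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbor))
        )
    return synthetic


# Hypothetical 2-D minority-class feature vectors; not study data.
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
synthetic = smote_like_oversample(minority, n_synthetic=6)
print(synthetic)
```

Because every synthetic point is a blend of two existing minority examples, a model trained on the balanced set sees many near-duplicates of the same few positives, which is consistent with the overfitting concern raised above.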

Conclusion By enabling the capture of data for eCQMs from documentation generated by usual clinical practice, this work represents a case study in how such techniques can be leveraged to drive quality without increasing clinician work.

Keywords

natural language processing - text mining - quality metrics - electronic clinical quality measure - diabetic eye exams

Publication History

Received: 09 April 2021

Accepted: 25 August 2021

Article published online: 01 October 2021

Georg Thieme Verlag KG
Rüdigerstraße 14, 70469 Stuttgart, Germany

