Methods Inf Med 2019; 58(06): 213-221
DOI: 10.1055/s-0040-1702159
Original Article
Georg Thieme Verlag KG Stuttgart · New York

Analysis of Feature Extraction Methods for Prediction of 30-Day Hospital Readmissions

Joel Sumner 1, Adel Alaeddini 1
1  Department of Mechanical Engineering, The University of Texas at San Antonio, San Antonio, Texas, United States

Publication History

01 May 2019

31 December 2019

Publication Date:
29 April 2020 (online)

Abstract

Objectives This article aims to determine how much feature extraction methods can improve machine learning methods for predicting 30-day hospital readmissions.

Methods The study evaluates five feature extraction methods, namely principal component analysis (PCA), kernel principal component analysis (KPCA), isomap, Laplacian eigenmaps, and locality preserving projections (LPP), for improving the accuracy of nine machine learning methods in predicting 30-day hospital readmissions. The prediction methods considered are logistic regression, Cox regression, linear discriminant analysis, k-nearest neighbor (KNN), support vector machines (SVMs), bagged trees, boosted trees, random forest, and artificial neural networks. All models are developed in MATLAB and validated using the area under the curve (AUC) on two population-based data sets from partner hospitals.
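As an illustration of the validation metric, the following minimal sketch computes an area under the curve for a binary readmission label. It assumes the curve in question is the receiver operating characteristic (ROC) curve, which the abstract does not state explicitly, and it uses the pairwise-comparison identity rather than any MATLAB routine. The function name `auc` and the toy labels and scores are hypothetical and are not taken from the study.

```python
def auc(labels, scores):
    """ROC AUC via pairwise comparison (assumed metric; not the authors' MATLAB code).

    Equals the probability that a randomly chosen readmitted patient (label 1)
    receives a higher score than a randomly chosen non-readmitted patient
    (label 0), with ties counted as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = readmitted within 30 days, 0 = not readmitted.
labels = [0, 0, 1, 1]
# Made-up scores from one classifier trained on raw features
# and on extracted features; these are not results from the study.
scores_raw = [0.1, 0.4, 0.35, 0.8]
scores_extracted = [0.1, 0.3, 0.35, 0.8]

print(f"AUC, raw features:       {auc(labels, scores_raw):.2f}")        # 0.75
print(f"AUC, extracted features: {auc(labels, scores_extracted):.2f}")  # 1.00
```

Comparing the two values for the same classifier is one simple way to express the improvement attributed to a feature extraction method. The pairwise loop is quadratic in the number of cases, so a rank-based implementation would be preferable for data sets of realistic size.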

Results Laplacian eigenmaps and isomap provide the largest improvement in the readmission prediction accuracy of KNN, SVM, bagged trees, boosted trees, and linear discriminant analysis. Artificial neural networks, random forest, Cox regression, and logistic regression improve on only one of the two data sets. PCA and LPP are the most computationally efficient, followed by KPCA, Laplacian eigenmaps, and isomap.

Conclusion Feature extraction methods can improve the predictive performance of machine learning methods for predicting readmissions. However, the improvement depends on the choice of prediction method, the choice of feature extraction method, and the complexity of the data set features.