Abstract
Objectives This article aims to determine whether feature extraction methods improve the
performance of machine learning methods for predicting 30-day hospital readmissions.
Methods The study evaluates five feature extraction methods, namely principal component
analysis (PCA), kernel principal component analysis (KPCA), isomap, Laplacian eigenmaps,
and locality preserving projections (LPP), for improving the accuracy of nine machine
learning methods in predicting 30-day hospital readmissions. The prediction methods
considered are logistic regression, Cox regression, linear discriminant analysis,
k-nearest neighbor (KNN), support vector machines (SVMs), bagged trees, boosted trees,
random forest, and artificial neural networks. All models are developed in MATLAB and
validated using area under the curve (AUC) on two population-based data sets from
partner hospitals.
Results Laplacian eigenmaps and isomap feature extraction provide the most improvement to
the readmission predictive accuracy of KNN, SVM, bagged trees, boosted trees, and
linear discriminant analysis methods. The results for artificial neural networks,
random forest, Cox regression, and logistic regression show improvement for only one
of the data sets. PCA and LPP are the most computationally efficient, followed
by KPCA, Laplacian eigenmaps, and isomap.
Conclusion Feature extraction methods can improve the predictive performance of machine learning
methods for predicting readmissions. However, the improvement depends on the specific
choice of the prediction method, feature extraction method, and the complexity of
the data set features.
Keywords
readmission prediction - feature extraction - classification
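
The evaluation described in Methods (extract features, train a classifier, score it by AUC) can be illustrated with a minimal sketch. It is not the study's MATLAB implementation: it is written in pure Python, uses synthetic data in place of the hospital data sets, covers only one extraction method (PCA, reduced to a single component by power iteration) and one classifier (KNN), and assumes AUC means the area under the ROC curve. All function names, the loadings, and the choice k=15 are hypothetical.

```python
import random

def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def fit_pca_1d(rows, iters=100):
    """Return feature means and the leading principal direction (power iteration)."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    v = [1.0 / d ** 0.5] * d
    for _ in range(iters):
        # w = X^T (X v), which is proportional to (covariance matrix) * v
        proj = [sum(c[j] * v[j] for j in range(d)) for c in centered]
        w = [sum(proj[i] * centered[i][j] for i in range(n)) for j in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return means, v

def transform_1d(rows, means, v):
    """Project each row onto the principal direction (one extracted feature)."""
    return [[sum((r[j] - means[j]) * v[j] for j in range(len(v)))] for r in rows]

def knn_scores(train_x, train_y, test_x, k=15):
    """Score each test row by the fraction of positives among its k nearest neighbors."""
    scores = []
    for q in test_x:
        dists = sorted(
            (sum((a - b) ** 2 for a, b in zip(q, x)), y)
            for x, y in zip(train_x, train_y)
        )
        scores.append(sum(y for _, y in dists[:k]) / k)
    return scores

# Synthetic stand-in data (an assumption): one latent risk factor drives five noisy features.
random.seed(0)
loadings = [1.0, 0.8, -0.6, 0.4, 0.2]
labels = [1 if i % 3 == 0 else 0 for i in range(300)]  # 1 = readmitted within 30 days
rows = []
for y in labels:
    latent = 1.5 * y + random.gauss(0, 1)
    rows.append([latent * w + random.gauss(0, 1) for w in loadings])

train_x, test_x = rows[:200], rows[200:]
train_y, test_y = labels[:200], labels[200:]

# Baseline: KNN on the raw features.
auc_raw = auc(test_y, knn_scores(train_x, train_y, test_x))

# With feature extraction: fit PCA on the training rows only, then apply it to both splits.
means, v = fit_pca_1d(train_x)
auc_pca = auc(
    test_y,
    knn_scores(transform_1d(train_x, means, v), train_y, transform_1d(test_x, means, v)),
)

print(f"AUC, raw features: {auc_raw:.3f}")
print(f"AUC, PCA features: {auc_pca:.3f}")
```

Fitting the projection on the training rows alone keeps the held-out AUC free of information from the test split. Whether the extracted features help depends on the data and the classifier, which is consistent with the abstract's conclusion.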