Prediction of Sepsis and In-Hospital Mortality Using Electronic Health Records
received: 31 January 2018
accepted: 05 June 2018
24 September 2018 (online)
Objectives: Our goal was to develop predictive models for sepsis and in-hospital mortality using electronic health records (EHRs). We showcased the efficiency of these algorithms in patients diagnosed with pneumonia, a group that is highly susceptible to sepsis.
Methods: We retrospectively analyzed the Health Facts® (HF) dataset to develop models to predict mortality and sepsis using the data from the first few hours after admission. In addition, we developed models to predict sepsis using the data collected in the last few hours leading to sepsis onset. We used the random forest classifier to develop the models.
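The idea of restricting a model's inputs to the first few hours after admission can be sketched as below. This is a minimal illustration, not the authors' pipeline: the `(timestamp, name, value)` record layout, the `window_features` helper, and the choice of min/max/mean/count summaries are all assumptions and do not reflect the Health Facts schema or the features actually used.

```python
from datetime import datetime, timedelta
from statistics import mean


def window_features(admitted_at, measurements, hours=12):
    """Summarise each measurement type over the first `hours` after admission.

    `measurements` is a list of (timestamp, name, value) tuples. This layout
    is a hypothetical stand-in, not the Health Facts schema.
    """
    cutoff = admitted_at + timedelta(hours=hours)
    by_name = {}
    for taken_at, name, value in measurements:
        # Keep only readings charted inside the observation window.
        if admitted_at <= taken_at <= cutoff:
            by_name.setdefault(name, []).append(value)
    # Sporadic charting means some measurement types never appear in the
    # window; those keys are absent and must be handled downstream.
    return {
        name: {
            "min": min(vals),
            "max": max(vals),
            "mean": mean(vals),
            "count": len(vals),
        }
        for name, vals in by_name.items()
    }


# Toy records (fabricated for illustration only).
admitted = datetime(2018, 1, 1, 8, 0)
records = [
    (datetime(2018, 1, 1, 9, 0), "heart_rate", 92.0),
    (datetime(2018, 1, 1, 15, 30), "heart_rate", 110.0),
    (datetime(2018, 1, 2, 10, 0), "heart_rate", 130.0),  # outside a 12 h window
    (datetime(2018, 1, 1, 10, 0), "temperature_c", 38.4),
]
features = window_features(admitted, records, hours=12)
```

Changing `hours` to 24 or 48 gives the wider windows mentioned in the abstract; a window ending at sepsis onset would need the same filter anchored on the onset time instead of the admission time.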
Results: Data in the EHR system are generally recorded sporadically, which makes feature extraction and selection difficult and lowers the accuracy of the models. Despite this, the developed models predict sepsis and in-hospital mortality with accuracies of up to 65.26±0.33% and 68.64±0.48%, and sensitivities of up to 67.24±0.36% and 74.00±1.22%, respectively, using only the data from the first 12 hours after admission. Accuracies remain broadly consistent for similar models developed using the data from the first 24 and 48 hours after admission. Lastly, the developed models accurately identify sepsis patients (with up to 98.63±0.17% accuracy and 99.74±0.13% sensitivity) using the data collected within the last 12 hours before sepsis onset. The results suggest that if such algorithms continuously monitor patients, they can identify sepsis patients in a manner comparable to current screening tools, such as the rule-based Systemic Inflammatory Response Syndrome (SIRS) criteria, while often allowing for early detection of sepsis shortly after admission.
Conclusions: The developed models showed promise in early prediction of sepsis, providing an opportunity for directing early intervention efforts to prevent/treat sepsis.
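The accuracy and sensitivity figures reported above can be read with the following sketch in mind. It computes both metrics from binary labels and summarises them across repeated runs. The abstract does not say what the "±" values represent, so treating them as a standard deviation over repeated runs is an assumption, and the helper names `accuracy_and_sensitivity` and `summarise` are hypothetical.

```python
from statistics import mean, stdev


def accuracy_and_sensitivity(y_true, y_pred):
    """Return (accuracy, sensitivity) for binary labels, 1 = positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed positives
    accuracy = (tp + tn) / len(pairs)
    # Sensitivity (recall) is the share of true positives that were detected.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    return accuracy, sensitivity


def summarise(values):
    """Return (mean, sample standard deviation) over repeated runs.

    Assumption: the abstract's "±" is taken to be a standard deviation.
    """
    return mean(values), stdev(values)


# Toy labels and predictions from two hypothetical runs (not study data).
runs = [
    ([1, 1, 0, 0], [1, 0, 0, 0]),
    ([1, 1, 0, 0], [1, 1, 0, 1]),
]
per_run = [accuracy_and_sensitivity(t, p) for t, p in runs]
acc_mean, acc_sd = summarise([a for a, _ in per_run])
sens_mean, sens_sd = summarise([s for _, s in per_run])
```

Reporting sensitivity alongside accuracy matters for a screening task like this one, because a model that misses septic patients can still score a high accuracy when most patients are not septic.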