DOI: 10.1055/a-2618-4470
Refining a Machine Learning Model for Predicting Infant Sepsis: A Multidisciplinary Team Supported by Human-Centered Design Methods
Funding: This study was funded by the Foundation for the National Institutes of Health (grant no. 1R01LM013526-01A1).

Abstract
Background
Human-centered design (HCD) methods in machine learning generally focus on workflow, user interfaces, and data visualizations, but they also have the potential to inform the model development and testing process itself.
Objectives
This study aimed to demonstrate the potential of HCD methods to support the design and testing of machine learning models developed for clinical decision-making.
Methods
In preparing for formative user testing of clinician-facing representations of a machine learning model for detecting sepsis in neonatal intensive care unit (NICU) patients, we discovered that interactive low-fidelity mockups using real patient data revealed potential model anomalies. To investigate these potential anomalies further, we qualitatively analyzed interviews with 31 NICU clinicians concerning their experience with neonatal sepsis. The review was conducted by a multidisciplinary team with expertise in neonatology, informatics, data science, and human-computer interaction (HCI). Anomalies identified via the mockups and interview analysis were then examined through inspection of patient charts and of the model's features and code.
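As a hedged illustration of how mockups driven by real patient data can surface model anomalies (this is a minimal sketch, not the study's actual tooling; the model artifact, file names, and column names are all assumptions), one might score every observation window and export the timelines for a mockup to replay:

```python
# Hypothetical sketch: scoring real patient observation windows so a
# low-fidelity mockup can replay each patient's risk trajectory.
# The model artifact, file names, and column names are assumptions.
import joblib
import pandas as pd

model = joblib.load("nicu_sepsis_model.joblib")    # assumed trained model
windows = pd.read_csv("nicu_feature_windows.csv")  # assumed EHR extract

feature_cols = [c for c in windows.columns
                if c not in ("patient_id", "window_start")]

# Attach a risk score to every observation window; reviewers can then
# step through a patient's timeline in the mockup, compare the model's
# scores against the chart, and flag surprising behavior.
windows["sepsis_risk"] = model.predict_proba(windows[feature_cols])[:, 1]
windows.to_csv("mockup_timelines.csv", index=False)
```

Scoring every window, rather than a single snapshot, is what lets temporal behavior of the model become visible during a review of this kind.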
Results
The HCD-facilitated review revealed anomalies in three categories: (1) feature inclusion and exclusion, (2) feature importance, and (3) model stability over time. Data entry errors in the electronic health record and their impact on model output were also noted. The review resulted in 41 changes to the model.
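To give a concrete, hedged sense of what inspecting the second and third anomaly categories can look like in code, the sketch below runs two generic checks against a synthetic stand-in model; the features, thresholds, and model are illustrative assumptions, not the study's pipeline.

```python
# Hypothetical sketch: two checks a multidisciplinary review might run,
# shown with synthetic data. Features, thresholds, and the model are
# illustrative assumptions, not the study's pipeline.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(0)

# Stand-in for EHR-derived features and sepsis labels.
feature_names = ["heart_rate", "temperature", "wbc_count", "mean_bp"]
X = rng.normal(size=(500, len(feature_names)))
y = (X[:, 0] + 0.5 * X[:, 2] + rng.normal(scale=0.5, size=500)) > 0

model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# (2) Feature importance: does the ranking match clinical expectation?
result = permutation_importance(model, X, y, n_repeats=10, random_state=0)
for name, mean_imp in sorted(zip(feature_names, result.importances_mean),
                             key=lambda pair: -pair[1]):
    print(f"{name:>12}: {mean_imp:.3f}")

# (3) Stability over time: score one patient's successive hourly windows
# and flag hour-to-hour jumps too large to explain from the chart.
patient_windows = rng.normal(size=(24, len(feature_names)))
scores = model.predict_proba(patient_windows)[:, 1]
suspect = np.where(np.abs(np.diff(scores)) > 0.3)[0]
print("suspect score jumps after windows:", suspect)
```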
Conclusion
The discovery of these 41 opportunities to improve our prediction model was a serendipitous by-product of the HCD process. Our results suggest that HCD can be applied not only to model display design and measures of explainability, but also to the development and evaluation of the model itself. This case report also demonstrates the need for a multidisciplinary team of clinicians, data scientists, and HCI experts in identifying and addressing issues involving machine learning model performance.
Keywords
artificial intelligence - clinical decision support - interfaces and usability - machine learning - neonatology
Protection of Human and Animal Subjects
The study was determined to be exempt from human subjects review by the Children's Hospital of Philadelphia Institutional Review Board (IRB 21-018777).
Publication History
Received: 30 December 2024
Accepted: 21 May 2025
Article published online: 10 October 2025
© 2025. Thieme. All rights reserved.
Georg Thieme Verlag KG
Oswald-Hesse-Straße 50, 70469 Stuttgart, Germany