Appl Clin Inform 2016; 07(01): 101-115
DOI: 10.4338/ACI-2015-09-RA-0114
Research Article
Schattauer GmbH

Natural Language Processing for Cohort Discovery in a Discharge Prediction Model for the Neonatal ICU

Michael W. Temple (1), Christoph U. Lehmann (1, 2), Daniel Fabbri (1)
1   Department of Biomedical Informatics, Vanderbilt University, Nashville, TN
2   Department of Pediatrics, Vanderbilt University, Nashville, TN

National Library of Medicine Training Grant 5T15LM007450-13.

Correspondence to:

Michael Temple
Department of Biomedical Informatics
Vanderbilt University School of Medicine
2525 West End
Suite 1475
Nashville, TN 37203-8390
Email: mtemple1@me.com
Phone: 615-936-1068

Publication History

Received: 12 September 2015

Accepted: 02 January 2016

Publication Date: 16 December 2017 (online)

 

Summary

Objectives

Discharge from the Neonatal Intensive Care Unit (NICU) can be delayed for non-medical reasons, including the procurement of home medical equipment, parental education, and the need for children’s services. We previously created a model to identify patients who will be medically ready for discharge in the subsequent 2–10 days. In this study, we use Natural Language Processing (NLP) to improve upon that model and to discern why the model performed poorly on certain patients.

Methods

We retrospectively examined the text of the Assessment and Plan section from daily progress notes of 4,693 patients (103,206 patient-days) from the NICU of a large, academic children’s hospital. A matrix was constructed using words from NICU notes (single words and bigrams) to train a supervised machine learning algorithm to determine the most important words differentiating patients for whom our original discharge prediction model performed poorly from those for whom it performed well.
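
The matrix construction described above can be illustrated with a minimal sketch. The abstract does not specify the tokenizer or the learning algorithm, so everything here is an assumption: the function names are hypothetical, the four notes are invented toy text rather than study data, and a simple difference in document frequency between the two groups stands in for the supervised learner. Only the Python standard library is used.

```python
import re
from collections import Counter


def tokenize(text):
    # Lowercase and keep alphabetic tokens (hyphens allowed inside a token).
    return re.findall(r"[a-z][a-z\-]*", text.lower())


def unigrams_and_bigrams(text):
    # Single words plus adjacent word pairs, as in a bag-of-words model.
    tokens = tokenize(text)
    bigrams = [" ".join(pair) for pair in zip(tokens, tokens[1:])]
    return tokens + bigrams


def build_matrix(notes):
    # Rows are notes, columns are vocabulary terms, cells are term counts.
    counts = [Counter(unigrams_and_bigrams(note)) for note in notes]
    vocab = sorted(set().union(*counts))
    matrix = [[c[term] for term in vocab] for c in counts]
    return vocab, matrix


def rank_terms(vocab, matrix, labels):
    # Stand-in for the supervised learner (an assumption, not the study's method):
    # score = share of label-1 notes containing the term
    #         minus share of label-0 notes containing it.
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    scores = {}
    for j, term in enumerate(vocab):
        pos = sum(1 for row, y in zip(matrix, labels) if y == 1 and row[j] > 0)
        neg = sum(1 for row, y in zip(matrix, labels) if y == 0 and row[j] > 0)
        scores[term] = pos / n_pos - neg / n_neg
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Invented toy notes; label 1 = original model performed poorly, 0 = performed well.
notes = [
    "g-tube feeds after surgery",
    "surgery consult for hernia",
    "feeding well on room air",
    "room air and feeding well",
]
labels = [1, 1, 0, 0]

vocab, matrix = build_matrix(notes)
ranked = rank_terms(vocab, matrix, labels)
print(ranked[:3])
```

Terms with the highest scores are those concentrated in the poorly predicted group; in a fuller pipeline, feature importances from a trained classifier would take the place of this score.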

Results

NLP using a bag of words (BOW) analysis revealed several cohorts for which our original model performed poorly. These included patients with surgical diagnoses, pulmonary hypertension, retinopathy of prematurity, and psychosocial issues.

Discussion

The BOW approach aided in cohort discovery and will allow further refinement of our original discharge prediction model. Adequately identifying patients discharged home on g-tube feeds alone could improve the AUC of our original model by 0.02. Additionally, this approach identified social issues as a major cause of delayed discharge.
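
For readers less familiar with the metric, the AUC can be read as the probability that a randomly chosen positive case receives a higher model score than a randomly chosen negative case. The sketch below computes it that way by pairwise comparison; the function name and the toy scores are hypothetical and are not drawn from the study, and the abstract does not state how the authors computed the AUC.

```python
def auc(scores, labels):
    # Rank-based AUC: fraction of (positive, negative) pairs in which the
    # positive case scores higher; ties count as half a win.
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Invented toy data: label 1 = positive class, 0 = negative class.
toy_labels = [1, 1, 0, 0]
before = auc([0.9, 0.4, 0.6, 0.1], toy_labels)
after = auc([0.9, 0.7, 0.6, 0.1], toy_labels)
print(before, after)
```

On this scale, a change of 0.02 means that two additional positive-negative pairs out of every hundred are ordered correctly.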

Conclusion

A BOW analysis provides a method to improve and refine our NICU discharge prediction model and could potentially avoid over 900 (0.9%) hospital days.

Abbreviations

AUC – Area under the Curve, CART – Classification And Regression Trees, DTD – Days to Discharge, GI – Gastrointestinal, LOS – Length of Stay, NICU – Neonatal Intensive Care Unit, NS – Neurosurgery, RF – Random Forest.



Conflict of Interest

The authors have no conflicts of interest to disclose.

