Methods Inf Med 2019; 58(01): 031-041
DOI: 10.1055/s-0039-1677692
Original Article
Georg Thieme Verlag KG Stuttgart · New York

Deep Learning versus Conventional Machine Learning for Detection of Healthcare-Associated Infections in French Clinical Narratives

Sara Rabhi
1  Telecom SudParis, Institut Mines-Telecom, Paris, Île-de-France, France
Jérémie Jakubowicz
1  Telecom SudParis, Institut Mines-Telecom, Paris, Île-de-France, France
Marie-Helene Metzger
2  INSERM U1018, Villejuif, France
3  Assistance Publique - Hôpitaux de Paris, Hôpital Antoine-Béclère, Clamart, France
4  Université Paris 13, UFR SMBH, Bobigny, France
Funding This work was partly funded by the French National Research Agency, as part of its TECSAN program (ANR-08-TECS-001 and ANR-12-TECS-0006).
Further Information

02 July 2018

04 December 2018

15 March 2019 (eFirst)


Objective The objective of this article was to compare the performance of deep learning and conventional machine learning (ML) methods for detecting health care-associated infections (HAIs) in French medical reports.

Methods The corpus consisted of different types of medical reports (discharge summaries, surgery reports, consultation reports, etc.). A total of 1,531 medical text documents were extracted from three French university hospitals and deidentified. Each document was labeled for the presence (1) or absence (0) of HAI. We first normalized the records using a series of preprocessing techniques. We used an overall performance metric, the F1 score, to compare a deep learning method (convolutional neural network [CNN]) with the most popular conventional ML models (Bernoulli and multinomial naïve Bayes, k-nearest neighbors, logistic regression, random forests, extra-trees, gradient boosting, support vector machines). We applied Bayesian hyperparameter optimization to each model based on its HAI identification performance. We treated the text representation as an additional hyperparameter for each model, choosing among four representations (bag of words, term frequency–inverse document frequency, word2vec, and GloVe).
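As a rough illustration of two of the text representations and of the evaluation metric named above, the following minimal sketch builds bag-of-words and term frequency–inverse document frequency vectors and computes an F1 score in plain Python. The function names, the whitespace tokenizer, the idf variant log(N/df), and the toy sentences are assumptions made for illustration; they are not taken from the study, which does not specify its implementation here.

```python
import math
from collections import Counter


def bag_of_words(docs):
    """Return (vocabulary, count vectors) for whitespace-tokenized documents."""
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        vectors.append([counts[term] for term in vocab])
    return vocab, vectors


def tf_idf(count_vectors):
    """Reweight raw counts by idf = log(N / df); one common variant among several."""
    n_docs = len(count_vectors)
    n_terms = len(count_vectors[0])
    # Document frequency: number of documents in which each term appears.
    df = [sum(1 for vec in count_vectors if vec[j] > 0) for j in range(n_terms)]
    return [
        [vec[j] * math.log(n_docs / df[j]) for j in range(n_terms)]
        for vec in count_vectors
    ]


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy, invented snippets (hypothetical; not from the study corpus).
docs = [
    "fever and wound infection after surgery",
    "routine consultation no infection",
    "wound healing well after surgery",
]
vocab, counts = bag_of_words(docs)
weights = tf_idf(counts)
```

In this sketch a term that occurs in every document receives a weight of zero, which is the intended effect of the idf factor: frequent, uninformative words contribute less than rare ones.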

Results The CNN outperformed all conventional ML algorithms for HAI classification. The best F1 score of 97.7% ± 3.6% and the best area under the curve of 99.8% ± 0.41% were achieved when the CNN was applied directly to the processed clinical notes without a pretrained word2vec embedding. Through receiver operating characteristic curve analysis, we achieved a good balance between false notifications (specificity of 0.937) and system detection capability (sensitivity of 0.962) using Youden's index as the reference.
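Youden's index is defined as J = sensitivity + specificity - 1, and the operating point on the receiver operating characteristic curve is usually taken where J is largest. With the values reported above, J = 0.962 + 0.937 - 1 = 0.899. The sketch below shows one way such a threshold could be selected; the function name, the rule of predicting positive at or above the threshold, and the toy scores are assumptions for illustration and do not reproduce the study's data or code.

```python
def youden_threshold(y_true, scores):
    """Return (threshold, sensitivity, specificity, J) maximizing J = sens + spec - 1."""
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    best = None
    for thr in sorted(set(scores)):
        # Predict positive when the score is at or above the candidate threshold.
        tp = sum(1 for t, s in zip(y_true, scores) if t == 1 and s >= thr)
        tn = sum(1 for t, s in zip(y_true, scores) if t == 0 and s < thr)
        sens = tp / n_pos
        spec = tn / n_neg
        j = sens + spec - 1
        # Keep the first threshold reaching the highest J seen so far.
        if best is None or j > best[3]:
            best = (thr, sens, spec, j)
    return best


# Toy, invented labels and classifier scores (hypothetical).
y_true = [0, 0, 0, 0, 1, 1, 1, 1, 1]
scores = [0.1, 0.2, 0.3, 0.6, 0.4, 0.7, 0.8, 0.9, 0.95]
thr, sens, spec, j = youden_threshold(y_true, scores)

# Youden's index implied by the sensitivity and specificity reported in the abstract.
reported_j = 0.962 + 0.937 - 1
```

Maximizing J weights false notifications and missed infections equally; a surveillance team that prefers fewer missed cases could instead choose a lower threshold at the cost of specificity.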

Conclusions The main drawback of CNNs is their opacity. To address this issue, we investigated the activation values of the CNN's inner layers to visualize the most meaningful phrases in a document. This method could be used to build a phrase-based medical assistant algorithm that helps infection control practitioners select relevant medical records. Our study demonstrated that a deep learning approach outperforms other classification algorithms for automatically identifying HAIs in medical reports.
