DOI: 10.1055/s-0045-1805412
Artificial intelligence model assessing the cleanliness of the upper gastrointestinal tract using PEACE scale – the feasibility study
Aims Cleanliness of the upper gastrointestinal (UGI) tract during endoscopy is a quality metric that has recently been quantified. The PEACE scale for cleanliness assessment has demonstrated a correlation with pathology detection rates [1]. A global reliability assessment of the PEACE scale showed good interobserver agreement (ICC 0.82) [2]. This study aimed to develop and perform an initial validation of an artificial intelligence (AI)-based system to assess cleanliness during esophagogastroduodenoscopy (EGD).
Methods A total of 381 EGD video samples collected between August 2021 and October 2024 were processed into 3–10 second video segments of the esophagus, stomach, and duodenum. Each video was scored from 0 to 3 on the PEACE scale by two independent raters. The dataset was divided into training (249 videos), validation (62 videos), and test (70 videos) datasets. A deep learning framework was created to classify videos into 12 classes based on location (esophagus, stomach, or duodenum) and cleanliness scores (0–3). The model employed DINOv2 for frame-level feature extraction, followed by an LSTM architecture to analyze temporal sequences of 30 frames with a stride of 15. Regularization techniques included feature-level dropout (0.6) and weighted cross-entropy loss to address class imbalance. The LSTM model had a two-layer structure with a hidden dimension of 256 and dense layers for classification. Training incorporated early stopping, learning rate scheduling, and a fine-tuning phase with reduced learning rates.
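The data preparation described in the Methods can be sketched in a few lines. This is a minimal illustration, not the authors' code: the ordering of the 12 classes, the handling of clips shorter than one window, and the inverse-frequency weighting formula are all assumptions, since the abstract only states that windows of 30 frames with a stride of 15 and a weighted cross-entropy loss were used. All function names are hypothetical.

```python
from collections import Counter

LOCATIONS = ("esophagus", "stomach", "duodenum")
N_SCORES = 4  # PEACE scores 0-3


def class_index(location: str, score: int) -> int:
    """Map (location, PEACE score) to one of 12 class ids.

    The ordering (location-major) is an assumption for illustration.
    """
    if not 0 <= score < N_SCORES:
        raise ValueError("PEACE score must be between 0 and 3")
    return LOCATIONS.index(location) * N_SCORES + score


def window_starts(n_frames: int, length: int = 30, stride: int = 15) -> list:
    """Start indices of every full window of `length` frames.

    Assumption: clips shorter than one window yield no windows.
    """
    if n_frames < length:
        return []
    return list(range(0, n_frames - length + 1, stride))


def inverse_frequency_weights(labels, n_classes: int = 12) -> list:
    """Per-class loss weights proportional to 1 / class frequency.

    One common weighting scheme (assumed here); classes absent from
    `labels` get weight 0.0.
    """
    counts = Counter(labels)
    total = len(labels)
    present = len(counts)
    return [
        total / (present * counts[c]) if counts[c] else 0.0
        for c in range(n_classes)
    ]


# Hypothetical 90-frame clip: windows overlap by half their length.
print(window_starts(90))            # [0, 15, 30, 45, 60]
print(class_index("duodenum", 3))   # 11
```

With a stride of half the window length, each frame away from the clip boundaries falls into two windows, which gives the sequence model overlapping temporal context.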
Results On the training dataset, the model achieved an accuracy of 93.98% for location and 91.16% for cleanliness scoring. On the validation dataset, accuracy was 90.32% for location and 75.81% for scoring. On the test dataset, accuracy across the 12 classes ranged from 87.30% to 98.41%: 96.83%, 96.83%, 93.65%, and 92.06% for the esophagus (scores 0–3, respectively); 93.65%, 90.48%, 90.47%, and 93.65% for the stomach (scores 0–3, respectively); and 98.41%, 92.06%, 92.06%, and 87.30% for the duodenum (scores 0–3, respectively). Precision ranged from 40% to 80% and recall from 42.86% to 83.33%.
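The abstract does not state how the per-class figures were computed. One possible reading, assumed here purely for illustration, is a one-vs-rest evaluation, in which true negatives from the other 11 classes count toward accuracy. Under that assumption, per-class accuracy can stay high while precision and recall are considerably lower. The labels below are toy values, not study data, and the function name is hypothetical.

```python
def one_vs_rest_metrics(y_true, y_pred, cls):
    """Accuracy, precision and recall for one class against all others."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == cls and p == cls for t, p in pairs)
    fp = sum(t != cls and p == cls for t, p in pairs)
    fn = sum(t == cls and p != cls for t, p in pairs)
    tn = len(pairs) - tp - fp - fn  # correct rejections inflate accuracy

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else None
    recall = tp / (tp + fn) if tp + fn else None
    return accuracy, precision, recall


# Toy labels (illustrative only): class 0 is rare.
y_true = [0, 0, 1, 1, 1, 2, 2, 2, 2, 2]
y_pred = [0, 1, 0, 1, 1, 2, 2, 2, 2, 2]

acc, prec, rec = one_vs_rest_metrics(y_true, y_pred, cls=0)
print(acc, prec, rec)  # 0.8 0.5 0.5
```

In this toy case the rare class reaches 80% accuracy with only 50% precision and recall, because seven of the ten samples are true negatives for that class.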
Conclusions This AI-based system demonstrated high accuracy in assessing cleanliness and location during EGD. These promising results warrant further validation in larger datasets and real-world clinical settings.
Publication History
Article published online: 27 March 2025
© 2025. European Society of Gastrointestinal Endoscopy. All rights reserved.
Georg Thieme Verlag KG
Oswald-Hesse-Straße 50, 70469 Stuttgart, Germany
References
- 1 Romańczyk M, Desai M, Kamiński MF. et al. International validation of a novel PEACE scale to improve the quality of upper gastrointestinal mucosal inspection during endoscopy. Clinical and Translational Gastroenterology 2024; doi:10.14309/ctg.0000000000000786
- 2 Romańczyk M, Ostrowski B, Lesińska M. et al. The prospective validation of a scoring system to assess mucosal cleanliness during EGD. Gastrointest Endosc 2024; 100 (01): 27-35
