DOI: 10.1055/a-2675-5169
Closing the gap in endoscopy quality: is artificial intelligence the answer?
Referring to Chen S et al. doi: 10.1055/a-2626-0069
Upper gastrointestinal (UGI) cancers remain a major global health burden. Gastric adenocarcinoma is the fifth most common malignancy worldwide, with 5-year survival rates of 20%–40%, largely due to late-stage diagnosis [1]. To combat this, measures to improve early detection are required. Despite advances in endoscopic imaging and established auditable quality standards for esophagogastroduodenoscopy (EGD) [2], significant variations in practice and outcomes still exist.
In this issue of Endoscopy, Chen et al. report the results of a multicenter randomized controlled trial evaluating the impact of a novel artificial intelligence (AI) system – the endoscopy quality control assistant (EQCA) – on real-time EGD quality and lesion detection [3]. Cancer-related lesions were defined as low-grade and high-grade intraepithelial neoplasia and cancer. Over 32 000 patients across seven centers in Zhejiang province, China were randomly assigned to undergo AI-assisted or standard (control) EGD, which consisted of an examination under white light, with virtual chromoendoscopy employed to interrogate abnormalities at the discretion of the endoscopist.
“If AI performance metrics become the new gold standard, a clear framework for human oversight, regulation, and medicolegal implications must be agreed upon.”
The current main application of AI systems in gastrointestinal endoscopy is in lesion detection and characterization. A small number of published trials have shown the utility of AI in monitoring inspection time, assessing completeness of mucosal inspection, and reducing blind spots [4] [5]. The novelty of the EQCA lies in its ability to assess videos rather than still images, by combining traditional deep convolutional neural networks with long short-term memory networks, enabling more accurate identification of anatomical landmarks. The UGI tract was divided into 31 anatomical sites, each assigned a part score based on frame quality and completeness of views obtained. A composite operation score across all sites provided a real-time measure of EGD quality. In the AI-assisted group, significantly higher detection rates for all cancer-related lesions were achieved compared with controls (8.00% vs. 5.55%, P < 0.001).
Of note, this effect was only sustained in the cardia/fundus and body of the stomach, likely reflecting the increased difficulty of adequately examining these areas with notorious blind spots, rugal folds, and gravity-related pooling of fluid residue.
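To make the scoring concept concrete, the sketch below illustrates, in simplified form, how per-site part scores might be combined into a composite operation score. The site names, weighting, and scoring rule are hypothetical assumptions for illustration only; they are not taken from the EQCA implementation described by the authors.

```python
# Illustrative sketch only: a minimal, hypothetical aggregation of per-site
# "part scores" into a composite "operation score", loosely modelled on the
# description of the EQCA above. The scoring rule and example values are
# assumptions, not the authors' implementation.

from dataclasses import dataclass

@dataclass
class SiteAssessment:
    site: str             # anatomical landmark, e.g. "gastric fundus"
    frame_quality: float  # 0-1, best frame quality captured for this site
    completeness: float   # 0-1, completeness of mucosal views obtained

def part_score(a: SiteAssessment) -> float:
    """Score a single anatomical site from frame quality and completeness."""
    return 100 * a.frame_quality * a.completeness

def operation_score(assessments: list[SiteAssessment], n_sites: int = 31) -> float:
    """Composite score across all expected sites; unvisited sites contribute 0."""
    total = sum(part_score(a) for a in assessments)
    return total / n_sites

# Example: an incomplete examination (2 of 31 sites assessed) yields a low
# composite score, which could be fed back to the endoscopist in real time.
exam = [
    SiteAssessment("gastric fundus", frame_quality=0.9, completeness=0.8),
    SiteAssessment("gastric body", frame_quality=0.7, completeness=0.6),
]
print(f"Operation score: {operation_score(exam):.1f} / 100")
```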
The AI-assisted group demonstrated higher operation scores, longer inspection times, and greater biopsy rates. Lesion detection was positively correlated with operation scores and inspection time in both the AI-assisted and control groups. Crucially, despite involving only experienced endoscopists (each having performed >5000 EGDs), high interoperator variability was observed, and key performance indicators were often unmet. Society guidelines recommend ≥7 minutes for EGD, yet the control group averaged only 5.6 minutes [2]. These results reaffirm a long-standing principle: thorough, systematic inspection improves lesion detection. One may argue that AI is not required, but rather upskilling of endoscopists to perform to the higher standard outlined in the guidelines.
However, the realities of real-world practice suggest influencing individual behavior is not that simple. Variation in practice and poor clinical outcomes are echoed globally. Postendoscopy UGI cancer (PEUGIC), a recognized surrogate marker of endoscopy quality, occurs in 6%–11% of UGI cancers [6]. In a national UK audit, PEUGIC rates varied from 5% to 13% across providers, reflecting real-world variation in quality [6]. In a root cause analysis, inadequate assessment of premalignant or focal lesions and inadequate endoscopy quality were identified as the most common causes of PEUGIC [7]. Notably, in the Chen et al. study, EQCA not only improved average performance but also reduced variability across centers, supporting its potential as a tool for standardization.
Over the past decade, the rate of PEUGIC has increased; reducing variation in endoscopy quality is therefore a matter of clinical urgency [6]. Implementing and maintaining quality assurance in endoscopy units through regular auditing is effective, but it is highly time- and resource-consuming, which impacts adherence. The retrospective nature of this process also delays improvements in patient outcomes. A strategy that provides real-time feedback and reduces interoperator variability in quality is required [8].
Nonetheless, limitations remain. The study lacks data on Helicobacter pylori status, smoking, family history of UGI neoplasia, and the presence of chronic atrophic gastritis, which are key variables for interpreting cancer-related lesion detection rates. EQCA was trained only on white-light imaging and was developed in a setting with a high prevalence of gastric cancer. Its training and implementation in Chinese centers, where the prevalence of Barrett’s esophagus is low, may account for the limited impact of EQCA in the esophagus. There was no comparison between operators of different experience levels. While the AI system enforced systematic inspection of 31 anatomical sites, it remains unclear whether a standardized mucosal visualization protocol was followed in the control group, limiting the interpretability of comparative completeness scores. Broader validation is needed, particularly in Western cohorts and training settings. Nevertheless, this trial represents a step forward in shifting AI from diagnostic support to real-time, automated quality assurance.
As AI tools mature, feasibility is no longer the barrier – implementation is. There must be transparency regarding the inputs used for training and validation of AI systems. Detailed patient characteristics must be provided, as it is imperative that they reflect the patient population encountered in clinical practice. AI must be benchmarked against objective, auditable standards and externally validated. In this study, image labeling was performed by experts (defined as having 5–10 years’ EGD experience), yet the findings demonstrate that experience alone does not guarantee expertise. If AI performance metrics become the new gold standard, a clear framework for human oversight, regulation, and medicolegal implications must be agreed upon. Finally, will systems like EQCA be integrated into training, deployed to support underperforming endoscopists and community settings, or embedded across all units?
Several questions remain before AI can be implemented into routine care. However, if we are serious about closing the quality gap in endoscopy, then AI-driven standardization may well be part of the answer.
Publication History
Received: 17 June 2025
Accepted after revision: 21 July 2025
Article published online: 18 August 2025
© 2025. Thieme. All rights reserved.
Georg Thieme Verlag KG
Oswald-Hesse-Straße 50, 70469 Stuttgart, Germany
References
- 1 Global Cancer Observatory. Accessed June 11, 2025 at: https://gco.iarc.fr/survival/survcan/dataviz/table?multiple_populations=0&sort_by=value1&survival=5&cancers=70
- 2 Bisschops R, Areia M, Coron E. et al. Performance measures for upper gastrointestinal endoscopy: a European Society of Gastrointestinal Endoscopy (ESGE) Quality Improvement Initiative. Endoscopy 2016; 48: 843-864
- 3 Chen S, Li Y, Wang H. et al. A novel artificial intelligence-based system for quality monitoring during esophagogastroduodenoscopy: a multicenter randomized controlled study. Endoscopy 2025; 57
- 4 Chen D, Wu L, Li Y. et al. Comparing blind spots of unsedated ultrafine, sedated, and unsedated conventional gastroscopy with and without artificial intelligence: a prospective, single-blind, 3-parallel-group, randomized, single-center trial. Gastrointest Endosc 2020; 91: 332-339.e3
- 5 Wu L, He X, Liu M. et al. Evaluation of the effects of an artificial intelligence system on endoscopy quality and preliminary testing of its performance in detecting early gastric cancer: a randomized controlled trial. Endoscopy 2021; 53: 1199-1207
- 6 Kamran U, Evison F, Morris EJA. et al. The variation in post-endoscopy upper gastrointestinal cancer rates among endoscopy providers in England and associated factors: a population-based study. Endoscopy 2025; 57: 17-28
- 7 Kamran U, King D, Abbasi A. et al. A root cause analysis system to establish the most plausible explanation for post-endoscopy upper gastrointestinal cancer. Endoscopy 2023; 55: 109-118
- 8 Messmann H, Bisschops R, Antonelli G. et al. Expected value of artificial intelligence in gastrointestinal endoscopy: European Society of Gastrointestinal Endoscopy (ESGE) Position Statement. Endoscopy 2022; 54: 1211-1231