Advancing Video Capsule Endoscopy with Edge AI: A Public Multi-Center Capsule Dataset with Multi-Label Annotations

M Le Floch; J Werner; L Mcintyre; F Wolf; J Steinhäuser; J Hampe; R Langanke; S Kirk; C Stopp; M E Geissler; F Brinkmann; N Herzog

doi:10.1055/s-0045-1805505

Endoscopy 2025; 57(S 02): S205

Abstracts | ESGE Days 2025

Moderated poster

Diagnosis and performance in endoscopy: an update; 03/04/2025, 16:00 – 17:00; Poster Dome 1 (P0)

Authors

M Le Floch
¹Else Kröner Fresenius Zentrum für Digitale Gesundheit, Dresden, Germany
²UKD | University Hospital Dresden Carl Gustav Carus, Department of Medicine I, Dresden, Germany

J Werner
³Department of Computer Science, University of Tübingen, Tübingen, Germany

L Mcintyre
⁴Institute of Computer Science, TUD Dresden University of Technology, Dresden, Germany

F Wolf
⁴Institute of Computer Science, TUD Dresden University of Technology, Dresden, Germany

J Steinhäuser
²UKD | University Hospital Dresden Carl Gustav Carus, Department of Medicine I, Dresden, Germany

J Hampe
²UKD | University Hospital Dresden Carl Gustav Carus, Department of Medicine I, Dresden, Germany

R Langanke
²UKD | University Hospital Dresden Carl Gustav Carus, Department of Medicine I, Dresden, Germany

S Kirk
²UKD | University Hospital Dresden Carl Gustav Carus, Department of Medicine I, Dresden, Germany

C Stopp
¹Else Kröner Fresenius Zentrum für Digitale Gesundheit, Dresden, Germany

M E Geissler
²UKD | University Hospital Dresden Carl Gustav Carus, Department of Medicine I, Dresden, Germany

F Brinkmann
²UKD | University Hospital Dresden Carl Gustav Carus, Department of Medicine I, Dresden, Germany

N Herzog
²UKD | University Hospital Dresden Carl Gustav Carus, Department of Medicine I, Dresden, Germany

Abstract

Aims Video capsule endoscopy is a minimally invasive tool for examining the small intestine, but its adoption is limited by labor-intensive manual interpretation and by battery constraints. This study seeks to improve the efficiency of capsule endoscopy through edge artificial intelligence (AI), which processes data on the device itself and thereby enables real-time diagnostics at the point of care. No edge AI solutions currently exist for capsule endoscopy. The Galar dataset, a comprehensive, multi-center, multi-label video capsule dataset, addresses this gap by supporting automated detection and anatomical localization. The project aims to facilitate broader clinical use and, potentially, improved patient outcomes.

Methods The Galar dataset, compiled at multiple centers across Germany, contains over 3.5 million frames from 80 capsule videos, annotated for technical, anatomical, and pathological features. Five expert annotators labeled the frames using the Computer Vision Annotation Tool (CVAT), and annotation accuracy was validated by training a ResNet-50 model. Building on this dataset, our collaborative research focuses on edge AI approaches that improve model performance. First, convolutional neural networks (CNNs) were combined with hidden Markov models (HMMs) for time-series analysis, enabling accurate localization within the gastrointestinal tract with a low-parameter model (approx. 1 million parameters) suited to low-power devices. Second, ensemble models combining image classifiers and autoencoders were used to optimize detection accuracy while minimizing computational load.
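
As an illustration of how a frame-level CNN can be combined with an HMM for localization, the following is a minimal sketch and not the authors' implementation. It assumes the CNN outputs a probability per anatomical section for each frame, treats those probabilities as emission scores, and uses a left-to-right HMM (the capsule only moves forward) decoded with the Viterbi algorithm. The section names, the `stay_prob` value, and the function name `viterbi_smooth` are hypothetical.

```python
import math

# Hypothetical section labels, ordered along the capsule's path.
SECTIONS = ["stomach", "small_intestine", "colon"]


def viterbi_smooth(frame_probs, stay_prob=0.99):
    """Return the most likely section index for every frame.

    frame_probs: list of per-frame probability lists (one value per section),
                 e.g. the softmax output of a frame-level CNN (assumed input).
    stay_prob:   assumed probability of staying in the same section between
                 two consecutive frames; the remainder goes to the next section.
    """
    n = len(frame_probs[0])
    eps = 1e-12  # avoids log(0) for zero probabilities
    neg_inf = float("-inf")

    def log_trans(i, j):
        # Left-to-right model: stay, or advance by exactly one section.
        if j == i:
            return 0.0 if i == n - 1 else math.log(stay_prob)
        if j == i + 1:
            return math.log(1.0 - stay_prob)
        return neg_inf

    # Assumption: every recording starts in the first section.
    score = [neg_inf] * n
    score[0] = math.log(frame_probs[0][0] + eps)
    back = []  # back[t][j] = best predecessor of state j at frame t + 1

    for probs in frame_probs[1:]:
        new_score, pointers = [], []
        for j in range(n):
            best_i = max(range(n), key=lambda i: score[i] + log_trans(i, j))
            new_score.append(
                score[best_i] + log_trans(best_i, j) + math.log(probs[j] + eps)
            )
            pointers.append(best_i)
        score = new_score
        back.append(pointers)

    # Trace the best path backwards from the best final state.
    state = max(range(n), key=lambda j: score[j])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path
```

In this sketch, a single frame that the CNN misclassifies as "small_intestine" while the surrounding frames are clearly "stomach" is overruled, because the model cannot move backwards and an early switch would force all later stomach frames into the wrong section. The HMM itself adds only a small transition table, which is consistent with the low-parameter goal described above.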

Results The Galar dataset is one of the most comprehensive resources in capsule endoscopy, with 29 anatomical and pathological labels relevant to gastrointestinal diagnostics. The collaborative edge AI approach integrating CNNs with HMMs achieved 93% accuracy on the dataset for anatomical localization. The ensemble models distinguished features within the dataset with an area under the curve (AUC) of 76% for anomaly detection. These low-complexity models are compatible with the energy constraints of capsule endoscopy and could enable real-time, autonomous evaluation on resource-limited devices. Compared with traditional CNN models, the edge AI models achieve similar, if not better, results with fewer resources.
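
To make the anomaly-detection metric concrete, the sketch below shows one way an ensemble score and its AUC could be computed. It is an illustration under stated assumptions, not the evaluation code used in the study: the weighted average in `ensemble_score`, the scaling by `max_error`, and both function names are hypothetical, and the abstract does not specify how the classifier and autoencoder outputs were combined. The AUC is computed as the fraction of (anomalous, normal) frame pairs in which the anomalous frame receives the higher score.

```python
def ensemble_score(classifier_prob, recon_error, max_error, weight=0.5):
    """Blend a classifier's anomaly probability with an autoencoder's
    reconstruction error (hypothetical combination rule).

    recon_error is scaled into [0, 1] using max_error, an assumed
    calibration constant such as the largest error seen on validation data.
    """
    scaled_error = min(recon_error / max_error, 1.0)
    return weight * classifier_prob + (1.0 - weight) * scaled_error


def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for anomalous frames, 0 for normal frames.
    scores: anomaly scores, higher meaning more likely anomalous.
    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy example: two normal and two anomalous frames.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(labels, scores))  # 0.75
```

Under this reading, an AUC of 76% means that a randomly chosen anomalous frame outscores a randomly chosen normal frame in about 76% of pairs. The pairwise loop is quadratic in the number of frames, so a rank-based formulation would be preferable for millions of frames.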

Conclusions As one of the largest annotated public datasets for video capsule endoscopy, the Galar dataset is a valuable resource for AI development in gastrointestinal imaging. The findings indicate that low-parameter, energy-efficient edge AI models can enhance diagnostic efficiency in capsule endoscopy, reduce clinician workload, and enable real-time anomaly detection directly on the capsule. Such models could perform preliminary analyses on-device and transmit only relevant images or real-time capsule localization data, thereby conserving battery life. This would make video capsule endoscopy a more efficient and accessible diagnostic tool.
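
The idea of transmitting only relevant images or localization updates can be sketched as a simple on-device filter. This is a hypothetical illustration rather than a described system: the message format, the `threshold` value, and the function name `select_transmissions` are assumptions, and the per-frame section and anomaly score are taken as given from upstream models.

```python
def select_transmissions(frames, threshold=0.5):
    """Decide what a capsule would transmit (hypothetical policy).

    frames: iterable of (frame_id, section, anomaly_score) tuples.
    Returns a list of messages:
      ("location", frame_id, section) whenever the section changes,
      ("image", frame_id, section)    whenever the score reaches the threshold.
    All other frames are dropped and never transmitted.
    """
    messages = []
    last_section = None
    for frame_id, section, score in frames:
        if section != last_section:
            # Send a localization update only on a section change.
            messages.append(("location", frame_id, section))
            last_section = section
        if score >= threshold:
            # Send the image only if it looks anomalous.
            messages.append(("image", frame_id, section))
    return messages
```

With a policy of this kind, the number of transmissions scales with the number of section changes and suspicious frames rather than with the total frame count, which is where the battery saving would come from.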