Endoscopy
DOI: 10.1055/a-2650-0789
Original article

Multicenter validation of a cholangioscopy artificial intelligence system for the evaluation of biliary tract disease

[author name missing in source] (1, 2)
Patrick D. Powers (1, 2)
Jad P. AbiMansour (3)
Matthew Marcello (1)
Nikhil R. Thiruvengadam (4)
Navine Nasser-Ghodsi (1)
Prashanth Rau (1)
Jaroslav Zivny (1)
Savant Mehta (1)
Christopher Marshall (1)
Paul Leonor (4)
Kendrick Che (4)
Barham Abu Dayyeh (3)
[author name missing in source] (3)
Bret T. Petersen (3)
Ryan Law (3)
John A. Martin (3)
[author name missing in source] (3)
Vinay Chandrasekhara (3)

Author Affiliations

1   Gastroenterology, UMass Chan Medical School, Worcester, United States (Ringgold ID: RIN12262)
2   Program in Digital Medicine, UMass Chan Medical School, Worcester, United States (Ringgold ID: RIN12262)
3   Gastroenterology and Hepatology, Mayo Clinic, Rochester, United States (Ringgold ID: RIN6915)
4   Gastroenterology, Loma Linda University, Loma Linda, United States (Ringgold ID: RIN4608)

Supported by: UMass Memorial Health
Supported by: University of Massachusetts Medical School
Supported by: MassVentures
Supported by: Mayo Clinic

Abstract

Background

Clinicians struggle to accurately classify biliary strictures as benign or malignant. Current endoscopic retrograde cholangiopancreatography (ERCP)-based sampling modalities, including brush cytology and forceps biopsy, have poor sensitivity for pathologic confirmation of malignancy. Cholangioscopy allows direct visualization and sampling of biliary pathology; however, this technology is also associated with inaccurate classification of biliary disease. Previously, an artificial intelligence (AI) system that analyzes cholangioscopy footage was found to be more accurate in diagnosing biliary malignancy than ERCP sampling techniques. The aim of this study was to validate this AI system on a new series of examinations.

Method

Three academic centers collected all available unedited cholangioscopy recordings. The videos were processed by the cholangioscopy AI system. After analyzing videos, the AI system provided predictions as to whether malignancy was present. AI performance in classifying strictures was compared with the performance of brush cytology and forceps biopsy.

Results

A total of 112 cholangioscopy examinations (comprising 4 817 081 images) were obtained from 99 patients. Of those examinations, 61 (54.5%) were performed to investigate biliary strictures (31 [50.8%] benign, 30 [49.2%] malignant). In classifying strictures, the AI system was 80.0% sensitive and 90.3% specific. It was also significantly more accurate for stricture classification (85.2%) than brush cytology (52.5%; P<0.001), forceps biopsy (68.2%; P=0.04), and the combination of brush cytology and forceps biopsy (66.7%; P=0.02).
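
The reported sensitivity, specificity, and accuracy can be cross-checked against the stricture counts given above. The short Python sketch below is illustrative only: the abstract does not state the underlying confusion-matrix counts, so the true-positive and true-negative counts are an assumption inferred by rounding the reported percentages against the 30 malignant and 31 benign strictures. The function name `diagnostic_metrics` is hypothetical.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) as fractions."""
    sensitivity = tp / (tp + fn)          # malignant strictures correctly flagged
    specificity = tn / (tn + fp)          # benign strictures correctly cleared
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Stricture counts reported in the Results paragraph.
MALIGNANT, BENIGN = 30, 31

# ASSUMPTION: counts inferred from the reported percentages, not stated in the abstract.
tp = round(0.800 * MALIGNANT)   # 80.0% sensitivity
tn = round(0.903 * BENIGN)      # 90.3% specificity
fn = MALIGNANT - tp
fp = BENIGN - tn

sens, spec, acc = diagnostic_metrics(tp, fn, tn, fp)
print(f"TP={tp} FN={fn} TN={tn} FP={fp}")
print(f"sensitivity={sens:.1%} specificity={spec:.1%} accuracy={acc:.1%}")
```

Under this assumption the inferred counts (24 true positives, 28 true negatives) give an accuracy of 52/61, which agrees with the reported 85.2%.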

Conclusion

A previously developed cholangioscopy AI system continued to outperform standard ERCP sampling modalities in identifying malignancy, without additional retraining, in a multicenter validation cohort.

Publication History

Received: 18 February 2025

Accepted after revision: 06 July 2025

Accepted Manuscript online: 06 July 2025

Article published online: 18 August 2025

© 2025. Thieme. All rights reserved.

Georg Thieme Verlag KG
Oswald-Hesse-Straße 50, 70469 Stuttgart, Germany