DOI: 10.1055/a-2650-0789
Multicenter validation of a cholangioscopy artificial intelligence for evaluation of biliary tract disease
Supported by: UMass Memorial Health
Supported by: University of Massachusetts Medical School
Supported by: MassVentures
Supported by: Mayo Clinic

Introduction: Clinicians struggle to accurately classify biliary strictures as benign or malignant. Current ERCP-based sampling modalities, including brush cytology and forceps biopsy, have poor sensitivity for pathologic confirmation of malignancy. Cholangioscopy allows for direct visualization and sampling of biliary pathology; however, this technology is also associated with inaccurate classification of biliary disease. Previously, an artificial intelligence (AI) that analyzes cholangioscopy footage was found to be more accurate in diagnosing biliary malignancy than ERCP sampling techniques. The aim of this study was to validate this AI on a new series of examinations.

Methods: Three academic centers collected all available, unedited cholangioscopy recordings. The videos were processed by the cholangioscopy AI. After analyzing the videos, the AI predicted whether malignancy was present. AI performance in classifying strictures was compared with the performance of brush cytology and forceps biopsy.

Results: 112 cholangioscopy examinations (containing 4,817,081 images) were generated from 99 patients. Of those examinations, 61 (54.5%) were for investigation of biliary strictures (31 [50.8%] benign, 30 [49.2%] malignant). For the correct classification of strictures, the AI was 80.0% sensitive and 90.3% specific. The AI was also significantly more accurate for stricture classification (85.2%) than brush cytology (52.5%; p < 0.001), forceps biopsy (68.2%; p = 0.037), and the combination of brush cytology and forceps biopsy (66.7%; p = 0.022).

Discussion: In a multicenter validation cohort, a previously developed cholangioscopy AI continued to outperform standard ERCP sampling modalities for accurate identification of malignancy, without additional retraining.
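The stricture-classification figures reported in the Results can be cross-checked with a short calculation. The sketch below is illustrative only: the abstract gives percentages and the benign/malignant totals (31 and 30), not the underlying confusion matrix, so the counts used here (24 true positives, 6 false negatives, 28 true negatives, 3 false positives) are an assumption inferred from those percentages, and `diagnostic_metrics` is a hypothetical helper name, not part of the study's software.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)                 # malignant strictures correctly flagged
    specificity = tn / (tn + fp)                 # benign strictures correctly cleared
    accuracy = (tp + tn) / (tp + fn + tn + fp)   # all correct calls over all strictures
    return sensitivity, specificity, accuracy


# ASSUMED counts, inferred from the reported percentages and totals
# (30 malignant, 31 benign); they are not stated in the abstract.
sens, spec, acc = diagnostic_metrics(tp=24, fn=6, tn=28, fp=3)

print(f"sensitivity={sens:.1%} specificity={spec:.1%} accuracy={acc:.1%}")
```

Under these assumed counts the three values round to the reported 80.0%, 90.3%, and 85.2%, which suggests the published figures are internally consistent across the 61 stricture examinations.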
Publication History
Received: 18 February 2025
Accepted after revision: 06 July 2025
Accepted Manuscript online: 06 July 2025
© Thieme. All rights reserved.
Georg Thieme Verlag KG
Oswald-Hesse-Straße 50, 70469 Stuttgart, Germany