Endoscopy
DOI: 10.1055/a-2780-0974
Original article

Development and validation of a multimodal deep learning model for early esophageal squamous neoplasia detection and invasion depth prediction

Authors

  • Chu-Ting Yu

    1   Gastroenterology, Changhai Hospital, Shanghai, China (Ringgold ID: RIN12520)
    2   National Clinical Research Center for Digestive Diseases (Shanghai), Shanghai, China
  • Ting-Lu Wang

    1   Gastroenterology, Changhai Hospital, Shanghai, China (Ringgold ID: RIN12520)
    2   National Clinical Research Center for Digestive Diseases (Shanghai), Shanghai, China
  • Ye Gao

    1   Gastroenterology, Changhai Hospital, Shanghai, China (Ringgold ID: RIN12520)
    2   National Clinical Research Center for Digestive Diseases (Shanghai), Shanghai, China
  • Zhi-Han Wu

    3   Gastroenterology, West China Hospital of Sichuan University, Chengdu, China (Ringgold ID: RIN34753)
  • Ying-Zhou Chen

    3   Gastroenterology, West China Hospital of Sichuan University, Chengdu, China (Ringgold ID: RIN34753)
  • Lei Shi

    4   Gastroenterology, Shandong Provincial Hospital Affiliated to Shandong First Medical University, Jinan, China (Ringgold ID: RIN34708)
  • Biao Liu

    5   Gastroenterology, The First Affiliated Hospital of Henan University of Science and Technology, Luoyang, China (Ringgold ID: RIN159366)
  • Hui Zhang

    6   Gastroenterology, First Affiliated Hospital of Shihezi University School of Medicine, Shihezi, China (Ringgold ID: RIN604058)
  • Hong-Wei Xu

    4   Gastroenterology, Shandong Provincial Hospital Affiliated to Shandong First Medical University, Jinan, China (Ringgold ID: RIN34708)
  • Wei-Gang Chen

    6   Gastroenterology, First Affiliated Hospital of Shihezi University School of Medicine, Shihezi, China (Ringgold ID: RIN604058)
  • She-Gan Gao

    5   Gastroenterology, The First Affiliated Hospital of Henan University of Science and Technology, Luoyang, China (Ringgold ID: RIN159366)
  • Jin-Lin Yang

    3   Gastroenterology, West China Hospital of Sichuan University, Chengdu, China (Ringgold ID: RIN34753)
  • Luo-Wei Wang

    1   Gastroenterology, Changhai Hospital, Shanghai, China (Ringgold ID: RIN12520)
    2   National Clinical Research Center for Digestive Diseases (Shanghai), Shanghai, China
  • Lin Han

    1   Gastroenterology, Changhai Hospital, Shanghai, China (Ringgold ID: RIN12520)
    2   National Clinical Research Center for Digestive Diseases (Shanghai), Shanghai, China

Supported by: China University Industry-Research Innovation Fund - Huaton Medical Research Special Program 2023HT064
Supported by: Grants from Shanghai Municipal Health Commission Fund 202240357

Clinical Trial:

Registration number (trial ID): NCT06412419, Trial registry: Clinical Trials Registry India (http://www.ctri.nic.in/Clinicaltrials), Type of Study: Prospective and Retrospective Multicenter Study



Abstract

Background

Early detection of esophageal squamous cell carcinoma (ESCC) is critical for optimizing patient outcomes. Magnifying endoscopy and endoscopic ultrasonography (EUS) serve as established diagnostic modalities. The multimodal ultrasound and magnifying endoscopic algorithm for early ESCC diagnostics (MUMA-EDx) integrates deep learning-based magnifying endoscopy and EUS imaging to improve early-stage ESCC identification and invasion depth assessment.

Methods

Model development and internal validation used a retrospective dataset; external validation used a prospective cohort. MUMA-EDx comprises two TResNet_m-based classifiers, one for magnifying endoscopy images and one for EUS images, whose outputs are combined by feature-level fusion. Model performance was evaluated using area under the receiver operating characteristic curve (AUROC), accuracy, sensitivity, specificity, positive predictive value, and negative predictive value.
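
As an illustration of what feature-level fusion means in general, the following minimal sketch concatenates one feature vector per modality and passes the result through a single linear layer with softmax. It is not the authors' implementation: the function names, feature dimensions, weights, and the choice of concatenation plus a linear head are all assumptions made for the example.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_and_classify(me_features, eus_features, weights, bias):
    """Hypothetical feature-level fusion head.

    me_features / eus_features: feature vectors from the two
    single-modality networks (assumed to be plain lists of floats).
    weights: one row of coefficients per output class.
    bias: one offset per output class.
    """
    # Fusion step: join the two feature vectors into one.
    fused = list(me_features) + list(eus_features)
    if any(len(row) != len(fused) for row in weights):
        raise ValueError("each weight row must match the fused feature length")
    # Linear layer over the fused vector, then softmax.
    logits = [
        sum(w * x for w, x in zip(row, fused)) + b
        for row, b in zip(weights, bias)
    ]
    return softmax(logits)


# Toy inputs (invented numbers, two classes).
me_features = [0.2, 0.8]   # stand-in for magnifying endoscopy features
eus_features = [0.5]       # stand-in for EUS features
weights = [[1.0, 0.0, 0.0],
           [0.0, 1.0, 1.0]]
bias = [0.0, 0.0]

probs = fuse_and_classify(me_features, eus_features, weights, bias)
```

Because the classifier sees both feature vectors at once, it can weigh evidence from the two modalities jointly, in contrast to averaging two separate predictions after the fact.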

Results

MUMA-EDx was developed and validated using a retrospective dataset comprising 358 patients (18 420 images) and subsequently tested prospectively on an independent cohort of 122 patients (8711 images). The feature-level multimodal approach significantly outperformed single-modality models. For tumor discrimination, the model achieved an AUROC of 0.94 (95%CI 0.92–0.96) in retrospective validation and a patient-level AUROC of 1.00 (95%CI 1.00–1.00) in prospective testing. For the more complex task of multiclass invasion depth classification, it achieved a retrospective AUROC of 0.95 (95%CI 0.88–0.99) and a prospective AUROC of 0.80 (95%CI 0.67–0.87). In a comparative study on invasion depth classification, the performance of MUMA-EDx exceeded that of novice endoscopists and was comparable to that of expert endoscopists.
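
To make the patient-level area under the ROC curve concrete, the sketch below aggregates per-image probabilities into one score per patient and computes the area as the fraction of positive/negative patient pairs that are ranked correctly. The abstract does not state how image-level outputs were aggregated; averaging per patient is an assumption here, and the patient identifiers, scores, and labels are invented toy data.

```python
from collections import defaultdict


def patient_level_scores(image_scores):
    """Average image probabilities per patient (assumed aggregation rule).

    image_scores: iterable of (patient_id, probability) pairs.
    """
    grouped = defaultdict(list)
    for patient_id, prob in image_scores:
        grouped[patient_id].append(prob)
    return {pid: sum(ps) / len(ps) for pid, ps in grouped.items()}


def auroc(scores, labels):
    """Rank-based AUROC for binary labels (1 = positive, 0 = negative).

    Equals the probability that a randomly chosen positive scores higher
    than a randomly chosen negative; ties count as half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data: four hypothetical patients, some with several images.
image_scores = [("A", 0.9), ("A", 0.7), ("B", 0.2),
                ("B", 0.4), ("C", 0.6), ("D", 0.5)]
truth = {"A": 1, "B": 0, "C": 1, "D": 0}

per_patient = patient_level_scores(image_scores)
ids = sorted(per_patient)
patient_auroc = auroc([per_patient[i] for i in ids],
                      [truth[i] for i in ids])
```

In this toy data every positive patient receives a higher averaged score than every negative patient, so the value is 1.0. A patient-level value of 1.00 therefore means the ranking of patients was perfect, not that every individual image was classified correctly.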

Conclusion

MUMA-EDx delivered highly accurate early ESCC detection and robust invasion depth classification, with performance comparable to that of expert endoscopists, and has the potential to enhance diagnostic precision and patient outcomes.



Publication History

Received: 10 July 2025

Accepted after revision: 29 December 2025

Accepted Manuscript online: 30 December 2025

Article published online: 03 March 2026

© 2026. Thieme. All rights reserved.

Georg Thieme Verlag KG
Oswald-Hesse-Straße 50, 70469 Stuttgart, Germany