Tapping the Pool of Non-Medical Images for Enhanced AI-Based Chest Radiography Analysis

S Tayebi Arasteh; C Kuhl; S Nebelung; D Truhn

doi:10.1055/s-0044-1781552

Rofo 2024; 196(S 01): S29
DOI: 10.1055/s-0044-1781552

Abstracts

Presentation (Science)

IT/Image Processing/Software

Authors

S Tayebi Arasteh

¹Uniklinik RWTH Aachen, Diagnostische und Interventionelle Radiologie, Aachen
C Kuhl

²Uniklinik RWTH Aachen, Diagnostische und Interventionelle Radiologie, Aachen
S Nebelung

²Uniklinik RWTH Aachen, Diagnostische und Interventionelle Radiologie, Aachen
D Truhn

²Uniklinik RWTH Aachen, Diagnostische und Interventionelle Radiologie, Aachen

Objective Traditionally, labeled datasets are used to develop artificial intelligence (AI)-based models in radiology. Self-supervised learning (SSL) uses unlabeled data for model development and can, in principle, draw on non-medical images. We studied the potential of SSL with non-medical images for developing an AI model that interprets chest radiographs (CXRs), comparing its performance with that of supervised learning (SL), which requires labeled data.

Materials and Methods Over 800,000 chest radiographs from open-access datasets (VinDr-CXR [n=18,000 CXRs], ChestX-ray14 [n=112,120], CheXpert [n=157,878], MIMIC-CXR [n=213,921], PadChest [n=110,525]) and our institution (UKA [n=193,361]), covering more than 20 labeled imaging findings, were included. Three pre-training strategies were assessed: i) SSL using unlabeled non-medical (natural) images (n=142 million images), ii) SL using labeled natural images (n=14 million images with more than 21,000 distinct labels), and iii) SL with task-specifically labeled CXRs from the MIMIC-CXR dataset, the largest CXR-specific dataset available (n=213,921 CXRs with 14 imaging findings). Diagnostic performance was compared using the AUC on held-out test sets (n=3,000, n=25,596, n=39,824, n=43,768, n=22,045, and n=39,824) and bootstrapping.
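The abstract does not say how the bootstrap comparison was carried out. As a minimal sketch, assuming a paired bootstrap over test cases for a single binary finding, the AUC difference between two pre-training strategies could be estimated as follows. The function names, the number of resamples, and the two-sided p-value estimate are illustrative assumptions, not the authors' procedure.

```python
import numpy as np


def auc(y_true, scores):
    """AUC via the Mann-Whitney U statistic, using average ranks for ties."""
    y_true = np.asarray(y_true, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    n_pos = int(y_true.sum())
    n_neg = y_true.size - n_pos
    if n_pos == 0 or n_neg == 0:
        return np.nan  # AUC is undefined without both classes
    order = np.argsort(scores, kind="mergesort")
    _, first, counts = np.unique(
        scores[order], return_index=True, return_counts=True
    )
    # 1-based average rank of each group of tied scores
    avg_rank = first + (counts + 1) / 2.0
    ranks = np.empty(scores.size, dtype=float)
    ranks[order] = np.repeat(avg_rank, counts)
    u = ranks[y_true].sum() - n_pos * (n_pos + 1) / 2.0
    return u / (n_pos * n_neg)


def paired_bootstrap_auc(y_true, scores_a, scores_b, n_boot=1000, seed=0):
    """Hypothetical helper: mean AUC difference (A minus B) and a two-sided
    bootstrap p-value, resampling the same test cases for both models."""
    rng = np.random.default_rng(seed)
    y_true = np.asarray(y_true)
    scores_a = np.asarray(scores_a, dtype=float)
    scores_b = np.asarray(scores_b, dtype=float)
    n = y_true.size
    diffs = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)  # sample cases with replacement
        d = auc(y_true[idx], scores_a[idx]) - auc(y_true[idx], scores_b[idx])
        if not np.isnan(d):
            diffs.append(d)
    diffs = np.asarray(diffs)
    p = 2.0 * min((diffs <= 0).mean(), (diffs >= 0).mean())
    return float(diffs.mean()), float(min(p, 1.0))


# Synthetic demonstration only: these are not data from the study.
rng = np.random.default_rng(42)
labels = rng.integers(0, 2, size=500)
model_a = labels + rng.normal(0.0, 0.5, size=500)  # informative scores
model_b = rng.normal(0.0, 1.0, size=500)           # uninformative scores
mean_diff, p_value = paired_bootstrap_auc(labels, model_a, model_b)
```

Resampling the same indices for both models keeps the comparison paired, so case-level difficulty affects both AUC estimates equally. For a multi-label setting such as the one described, one option is to repeat this per finding and average the results.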

Results SSL pre-training using non-medical images consistently outperformed SL pre-training on the ImageNet dataset across all datasets (p<0.001 for all comparisons). The average AUC (%, mean ± standard deviation) over all datasets was 83.3 ± 5.4, with a minimum of 79.1 ± 6.3 and a maximum of 89.7 ± 3.3. Notably, on larger datasets such as CheXpert and UKA, pre-training strategy (i) even surpassed pre-training strategy (iii) (p<0.001).

Conclusions Tapping the large pool of non-medical images for the SSL-based development of AI models for medical image analysis represents a transformative approach to enhancing efficiency and accuracy, especially when task-specific labeled datasets are unavailable.

Publication History

Article published online:
12 April 2024

Georg Thieme Verlag
Rüdigerstraße 14, 70469 Stuttgart, Germany
