Artificial Intelligence for Indication of Invasive Assessment of Calcifications in Mammography Screening

Stefanie Weigel; Anne-Kathrin Brehl; Walter Heindel; Laura Kerschke

doi:10.1055/a-2143-1428

Senologie - Zeitschrift für Mammadiagnostik und -therapie 2023; 20(03): 216-224

Scientific Article

Artificial Intelligence for Indication of Invasive Assessment of Calcifications in Mammography Screening

Stefanie Weigel¹, Anne-Kathrin Brehl², Walter Heindel¹, Laura Kerschke³

¹Clinic for Radiology and Reference Center for Mammography, University Hospital and University of Münster, Münster, Germany
²ScreenPoint Medical, Nijmegen, The Netherlands
³Institute of Biostatistics and Clinical Research, University of Münster, Münster, Germany


Key Points:

The PPV3 of calcification assessment under AI evaluation is comparable to that under human evaluation.
Category 4a is subject to the most pronounced AI-induced reduction in both screening-positive and screening-negative lesions.
The score values do not discriminate between subgroups of histological lesions.

Citation Format

Weigel S, Brehl AK, Heindel W et al. Artificial Intelligence for Indication of Invasive Assessment of Calcifications in Mammography Screening. Fortschr Röntgenstr 2023; 195: 38–46

Abstract

Purpose Lesion-related evaluation of the diagnostic performance of an individual artificial intelligence (AI) system in assessing mammographically detected and histologically proven calcifications.

Materials and Methods This retrospective study included 634 women from one screening unit (July 2012 – June 2018) who completed the invasive assessment of calcifications. For each lesion, the AI system calculated a score between 0 and 98. Lesions scored > 0 were classified as AI-positive. The performance of the system was evaluated on a per-lesion basis using the positive predictive value of invasive assessment (PPV3), the false-negative rate and the true-negative rate.

Results The PPV3 increased across the categories (readers: 4a: 21.2 %, 4b: 57.7 %, 5: 100 %, overall: 30.3 %; AI: 4a: 20.8 %, 4b: 57.8 %, 5: 100 %, overall: 30.7 %). The AI system yielded a false-negative rate of 7.2 % (95 % CI: 4.3 %, 11.4 %) and a true-negative rate of 9.1 % (95 % CI: 6.6 %, 11.9 %). These rates were highest in category 4a, at 12.5 % and 10.4 %, respectively. The lowest median AI score was observed for benign lesions (61, interquartile range [IQR]: 45–74). Invasive cancers yielded the highest median AI score (81, IQR: 64–86). Median AI scores for ductal carcinoma in situ were 74 (IQR: 63–84) for low grade, 70 (IQR: 52–79) for intermediate grade and 74 (IQR: 66–83) for high grade.

Conclusion At the lowest threshold, the AI system yielded calcification-related PPV3 values that increased across categories, similar to those seen in human evaluation. The greatest AI-related loss of breast cancer detections occurred among invasively assessed calcifications with the lowest suspicion of malignancy, accompanied by a comparable reduction in false-positive invasive assessments. A stratification of malignant lesions based on the AI score could not be derived.
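
To illustrate how the lesion-level measures named in the abstract relate to one another, the following Python sketch computes PPV3 for readers and for AI together with the false-negative and true-negative rates of the AI assessment. It assumes that every lesion in the input underwent invasive assessment, that malignancy is known from histology, and that the two rates are taken over malignant and benign lesions, respectively. The function names, the toy data and the choice of a Wilson interval are assumptions made for illustration; they are not taken from the study, and the study's own interval method is not stated in the abstract.

```python
from math import sqrt


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a proportion (assumed method, ~95 % for z=1.96)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z ** 2 / total
    centre = (p + z ** 2 / (2 * total)) / denom
    half = z * sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return centre - half, centre + half


def ai_performance(lesions, threshold=0):
    """Summarise AI performance for invasively assessed lesions.

    lesions: iterable of (ai_score, is_malignant) pairs, one per lesion.
    A lesion counts as AI-positive when ai_score > threshold.
    Assumes at least one malignant, one benign and one AI-positive lesion.
    """
    lesions = list(lesions)
    malignant = [score for score, is_malignant in lesions if is_malignant]
    benign = [score for score, is_malignant in lesions if not is_malignant]
    ai_positive = [(s, m) for s, m in lesions if s > threshold]

    true_pos = sum(1 for _, m in ai_positive if m)
    false_neg = sum(1 for s in malignant if s <= threshold)
    true_neg = sum(1 for s in benign if s <= threshold)

    return {
        # Readers referred every lesion, so PPV3 is malignant / all assessed.
        "ppv3_readers": len(malignant) / len(lesions),
        # For AI, only lesions scored above the threshold count as referred.
        "ppv3_ai": true_pos / len(ai_positive),
        "false_negative_rate": false_neg / len(malignant),
        "true_negative_rate": true_neg / len(benign),
    }


if __name__ == "__main__":
    # Hypothetical toy data, NOT study data: (AI score, malignant on histology)
    toy = [(0, False), (0, True), (61, False), (74, True),
           (81, True), (45, False), (70, True), (0, False)]
    for name, value in ai_performance(toy).items():
        print(f"{name}: {value:.3f}")
    print("interval for 1 of 4:", wilson_interval(1, 4))
```

Raising `threshold` above 0 in this sketch would trade a higher false-negative rate for a higher true-negative rate, which is the trade-off the abstract reports for category 4a at the lowest threshold.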

Key words

breast cancer - mammography screening - artificial intelligence - breast calcifications - positive predictive value - ductal carcinoma in situ
