DOI: 10.1055/s-0045-1809567
An Institutional Large Language Model for Musculoskeletal MRI Improves Protocol Adherence and Accuracy
Purpose or Learning Objective: Privacy-preserving large language models (PP-LLMs) can potentially assist clinicians with documentation. This study evaluated whether a privacy-preserving large language model could improve the clinical information on musculoskeletal magnetic resonance imaging radiology request forms and automate protocoling, thereby helping to ensure that the most appropriate imaging is performed.
Methods or Background: This retrospective study included musculoskeletal magnetic resonance imaging radiology request forms randomly collected from June to December 2023. Studies without electronic medical record entries were excluded. An institutional privacy-preserving large language model (Claude Sonnet 3.5) augmented the original radiology request forms by mining electronic medical records and, in combination with rule-based processing of the large language model outputs, suggested appropriate protocols using institutional guidelines. Clinical information on the original and the privacy-preserving large language model radiology request forms was compared using the Reason for exam Imaging Reporting and Data System by two musculoskeletal radiologists independently (Msk1: 13 years of experience; Msk2: 11 years of experience). These radiologists established a consensus reference standard for protocoling, against which the privacy-preserving large language model and two second-year board-certified radiologists (Rad1 and Rad2) were compared. Interrater reliability was assessed with Gwet's AC1 statistic, and percentage agreement with the reference standard was calculated.
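For readers unfamiliar with Gwet's AC1, the following is a minimal sketch of the two-rater form of the statistic. It is not the authors' analysis code, which the abstract does not describe; the function name `gwet_ac1` and the toy ratings are hypothetical and serve only to illustrate the formula AC1 = (pa − pe) / (1 − pe), where pa is the observed agreement and pe is the chance agreement computed from the average category prevalences of the two raters.

```python
from collections import Counter


def gwet_ac1(ratings_a, ratings_b, categories=None):
    """Gwet's AC1 for two raters on nominal categories (illustrative sketch)."""
    if len(ratings_a) != len(ratings_b) or len(ratings_a) == 0:
        raise ValueError("Both raters must rate the same non-empty set of items.")
    n = len(ratings_a)
    if categories is None:
        categories = sorted(set(ratings_a) | set(ratings_b))
    q = len(categories)
    if q < 2:
        raise ValueError("AC1 needs at least two categories.")

    # Observed agreement: share of items on which both raters agree.
    pa = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n

    # Average prevalence of each category across the two raters.
    counts_a, counts_b = Counter(ratings_a), Counter(ratings_b)
    prevalence = [(counts_a[c] + counts_b[c]) / (2 * n) for c in categories]

    # Chance agreement under Gwet's model.
    pe = sum(p * (1 - p) for p in prevalence) / (q - 1)

    return (pa - pe) / (1 - pe)


# Hypothetical toy data: 10 request forms rated by two readers.
reader_1 = ["adequate"] * 8 + ["limited"] * 2
reader_2 = ["adequate"] * 7 + ["limited"] * 3
print(round(gwet_ac1(reader_1, reader_2), 2))  # 0.84
```

In the toy data the readers agree on 9 of 10 forms (pa = 0.90) and the chance agreement is pe = 0.375, which gives AC1 = 0.84. AC1 is often chosen over Cohen's kappa when one category dominates, because kappa can be very low despite high raw agreement in that situation.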
Results or Findings: Overall, 500 musculoskeletal magnetic resonance imaging radiology request forms were analyzed (407 patients; mean age: 50.3 years ± 19.5 [standard deviation]; 202 women) across a range of body regions: spine/pelvis (n = 143/500 [28.6%]), upper extremity (n = 169/500 [33.8%]), and lower extremity (n = 188/500 [37.6%]). Of these, 222/500 (44.4%) required contrast. The clinical information in the privacy-preserving large language model–augmented radiology request forms was rated as superior to that in the original requests: only 0.4 to 0.6% of the augmented forms were rated as limited/deficient, compared with 12.4 to 22.6% of the original requests (P < 0.001). Interobserver agreement was almost perfect for the LLM-enhanced requests (AC1: 0.99; 95% confidence interval [CI] 0.99–1.0) and substantial for the original forms (AC1: 0.62; 95% CI 0.56–0.67). For protocoling, Msk1 and Msk2 showed almost perfect agreement on the region/coverage (AC1: 0.96; 95% CI 0.95–0.98) and contrast requirement (AC1: 0.98; 95% CI 0.97–0.99). Compared with the consensus reference standard, protocoling accuracy for the privacy-preserving large language model was 95.8% (95% CI 94.0–97.6%), significantly higher than that of both Rad1 (88.6%; 95% CI 85.8–91.4%) and Rad2 (88.2%; 95% CI 85.4–91.0%) (both P < 0.001).
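As a worked check on the accuracy figures, the sketch below computes a normal-approximation (Wald) 95% confidence interval for a proportion. The abstract does not state which interval method was used, so the Wald method is an assumption here; the function name `wald_ci` is hypothetical, and the counts 479, 443, and 441 are back-calculated from the reported percentages of 500 forms rather than taken from the source.

```python
import math


def wald_ci(successes, n, z=1.96):
    """Proportion with a normal-approximation (Wald) confidence interval."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("Need 0 <= successes <= n and n > 0.")
    p = successes / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    # Clamp to [0, 1] because the Wald interval can overshoot near the bounds.
    return p, max(0.0, p - half_width), min(1.0, p + half_width)


# Counts inferred from the reported percentages (assumption, n = 500 each).
inferred_counts = {"PP-LLM": 479, "Rad1": 443, "Rad2": 441}

for reader, correct in inferred_counts.items():
    p, lower, upper = wald_ci(correct, 500)
    print(f"{reader}: {p * 100:.1f}% (95% CI {lower * 100:.1f}-{upper * 100:.1f}%)")
# PP-LLM: 95.8% (95% CI 94.0-97.6%)
# Rad1: 88.6% (95% CI 85.8-91.4%)
# Rad2: 88.2% (95% CI 85.4-91.0%)
```

Under these assumptions the Wald interval matches the intervals reported above to one decimal place. For proportions close to 0 or 1, a Wilson or exact interval is generally preferred, since the Wald interval can be too narrow there.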
Conclusion: Musculoskeletal magnetic resonance imaging request form augmentation by an institutional large language model provided superior clinical information and improved protocoling accuracy compared with clinician requests and non–musculoskeletal-trained radiologists. Institutional adoption of such large language models could enhance magnetic resonance imaging utilization and patient care.
Publication History
Article published online:
02 June 2025
© 2025. Thieme. All rights reserved.
Thieme Medical Publishers, Inc.
333 Seventh Avenue, 18th Floor, New York, NY 10001, USA