J Knee Surg
DOI: 10.1055/a-2693-0756
Original Article

Comparative Efficacy of ChatGPT and Gemini in Addressing Patient Queries on Gonarthrosis and Total Knee Arthroplasty: A Randomized Controlled Trial

Authors

  • Serhat Gurbuz

    1   Department of Orthopedics and Traumatology, Health Sciences University Turkey, Metin Sabancı Baltalimanı Bone Diseases Training and Research Center, İstanbul, Turkey
  • Bulent Karslioglu

    1   Department of Orthopedics and Traumatology, Health Sciences University Turkey, Metin Sabancı Baltalimanı Bone Diseases Training and Research Center, İstanbul, Turkey
  • Ahmet Keskin

    1   Department of Orthopedics and Traumatology, Health Sciences University Turkey, Metin Sabancı Baltalimanı Bone Diseases Training and Research Center, İstanbul, Turkey
  • Niyazi Igde

    1   Department of Orthopedics and Traumatology, Health Sciences University Turkey, Metin Sabancı Baltalimanı Bone Diseases Training and Research Center, İstanbul, Turkey
  • Mustafa Bugra Ayaz

    1   Department of Orthopedics and Traumatology, Health Sciences University Turkey, Metin Sabancı Baltalimanı Bone Diseases Training and Research Center, İstanbul, Turkey
  • Yunus Imren

    2   Department of Orthopedics and Traumatology, Istinye University Turkey, Liv Hospital Vadistanbul, İstanbul, Turkey

Abstract

The emergence of artificial intelligence (AI) in health care has created novel opportunities for enhancing patient education and alleviating anxiety. This study evaluates the effectiveness of two leading AI platforms, ChatGPT and Gemini, in delivering accurate and satisfactory responses to patients with gonarthrosis who are considering total knee arthroplasty (TKA). A prospective, randomized controlled trial was conducted involving 100 patients diagnosed with gonarthrosis and indicated for TKA. Each patient posed five questions regarding the surgery and postoperative rehabilitation to both ChatGPT and Gemini. Responses were evaluated by two blinded orthopaedic specialists on a 10-point scale for accuracy and patient satisfaction. Patients additionally rated their satisfaction with each response on a 10-point scale. The main outcome measures were the average accuracy scores assigned by specialists and the average satisfaction scores reported by patients. Statistical analysis revealed significant differences between ChatGPT and Gemini in both accuracy and patient satisfaction (p < 0.001). ChatGPT performed better, with a mean accuracy score of 8.7 ± 0.9 compared with Gemini's 7.2 ± 1.1. Patient satisfaction scores aligned with expert evaluations, with ChatGPT achieving a mean satisfaction score of 8.9 ± 0.8 versus Gemini's 7.5 ± 1.2. Notably, ChatGPT excelled in providing comprehensive explanations of surgical procedures (mean score: 9.2 ± 0.7) and postoperative care (9.1 ± 0.8), whereas Gemini performed better in offering concise summaries of recovery timelines (8.4 ± 0.9). This study demonstrates that ChatGPT offers more accurate and satisfactory responses than Gemini to patient queries regarding gonarthrosis and TKA. The findings suggest that AI platforms, particularly ChatGPT, can serve as valuable tools in augmenting patient education and potentially reducing preoperative anxiety. Future studies should investigate the incorporation of AI-assisted information delivery into clinical practice and its long-term effects on patient outcomes.
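
The abstract reports means and standard deviations but no effect size. As a rough illustration of how large the reported differences are, the sketch below computes a standardized mean difference (Cohen's d) from the summary values in the abstract. It assumes two independent groups of equal size and ignores the within-patient pairing of the study design, so the result is only an approximation and not a figure reported by the authors. The function name cohens_d is hypothetical.

```python
import math


def cohens_d(mean_a, sd_a, mean_b, sd_b):
    """Standardized mean difference for two equal-sized groups.

    Assumption: groups are treated as independent; the paired design
    of the trial (each patient queried both platforms) is ignored.
    """
    pooled_sd = math.sqrt((sd_a ** 2 + sd_b ** 2) / 2)
    return (mean_a - mean_b) / pooled_sd


# Summary statistics as reported in the abstract:
# (ChatGPT mean, ChatGPT SD, Gemini mean, Gemini SD)
reported = {
    "specialist accuracy": (8.7, 0.9, 7.2, 1.1),
    "patient satisfaction": (8.9, 0.8, 7.5, 1.2),
}

for outcome, (m_c, s_c, m_g, s_g) in reported.items():
    print(f"{outcome}: d = {cohens_d(m_c, s_c, m_g, s_g):.2f}")
# specialist accuracy: d = 1.49
# patient satisfaction: d = 1.37
```

Under these assumptions, both values lie well above 0.8, the conventional threshold for a large effect.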



Publication History

Received: 11 December 2024

Accepted: 30 August 2025

Article published online: 19 September 2025

© 2025. Thieme. All rights reserved.

Thieme Medical Publishers, Inc.
333 Seventh Avenue, 18th Floor, New York, NY 10001, USA