Aims Recent studies have shown that large language models (LLMs) can enhance understanding
of colorectal cancer (CRC) screening, potentially increasing participation rates. However, a limitation
of these studies is that the questions posed to the LLMs were generated by experts. This study
aims to investigate the effectiveness of ChatGPT-4o in answering CRC screening queries
generated directly by patients.
Methods Ten patients formulated questions across four CRC screening scenarios, which were
posed to ChatGPT-4o in two separate sessions. Responses were assessed by five experts
and ten patients.
Results Experts rated the responses with mean scores of 4.1±1.0 for accuracy, 4.2±1.0 for
completeness, and 4.3±1.0 for comprehensibility. Patients rated the responses as complete in 97.5%, understandable in 95%, and trustworthy in 100% of cases. Finally, we evaluated the text similarity between each pair of responses
obtained from ChatGPT-4o in the first and second sessions. The results showed an average
similarity of 86.8±2.7% (range 82%-93%), indicating good consistency of outputs over
time.
Conclusions Despite variability in both questions and answers, ChatGPT-4o showed good performance
in answering CRC screening queries, even when used directly by patients.