Abstract
Background The evolution of artificial intelligence has introduced new ways to disseminate health
information, including large language models such as ChatGPT. However, the
quality and readability of such AI-generated information remain understudied.
This study is the first to compare the quality and readability of AI-generated
health information against leaflets produced by professionals.
Methodology Five ENT UK patient information leaflets and their corresponding ChatGPT
responses were extracted from the Internet. Assessors with varying degrees of medical
knowledge evaluated the content using the Ensuring Quality Information for Patients
(EQIP) tool and readability tools including the Flesch-Kincaid Grade Level (FKGL).
Statistical analysis was performed to identify differences between leaflets, assessors,
and sources of information.
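For reference (not stated in the original abstract), the FKGL is conventionally computed from word, sentence, and syllable counts; the sketch below assumes the standard published weights were applied:
\[ \mathrm{FKGL} = 0.39\,\frac{\text{total words}}{\text{total sentences}} + 11.8\,\frac{\text{total syllables}}{\text{total words}} - 15.59 \]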
Results ENT UK leaflets were of moderate quality, with a median EQIP score of 23. Statistically
significant differences in overall EQIP score were identified between ENT UK leaflets,
but ChatGPT responses were of uniform quality. Nonspecialist doctors gave the highest
EQIP scores, while medical students gave the lowest. The mean readability of ENT
UK leaflets was higher than that of ChatGPT responses. The information metrics of ENT UK leaflets
were moderate and varied between topics. Equivalent ChatGPT information provided comparable
content quality, but with reduced readability.
Conclusion ChatGPT patient information and professionally produced leaflets had comparable content,
but the large language model content required a higher reading age. Given the increasing
use of online health resources, this study highlights the need for a balanced approach
that considers both the quality and readability of patient education materials.
Keywords ChatGPT - patient information leaflets - rhinology leaflets - facial plastic surgery
leaflets - patient information