Türk Medline

EVALUATION OF THE ACCURACY OF CHATGPT-GENERATED INFORMATION ON HUMAN PAPILLOMAVIRUS: A PHYSICIAN-BASED ASSESSMENT STUDY

Halil Cagri AYBAL, Mehmet YILMAZ, Mehmet DUVARCI, Esra KIRATLI NALBANT, Isa DAGLI, Lutfi TUNC

The Medical Bulletin of Haseki - 2026;64(2):85-91

University of Health Sciences Türkiye, Gulhane Training and Research Hospital, Clinic of Urology, Ankara, Türkiye

 

Aim: Artificial intelligence (AI) applications are widely used to identify solutions to patients' problems. This study aims to evaluate the scientific validity of the information that patients can access on human papillomavirus (HPV)-related topics using Chat-Generative Pre-Trained Transformer (ChatGPT).

Methods: This study was conducted between July 1 and August 1, 2025. A physician developed a structured set of HPV-related questions. The responses generated by ChatGPT were independently evaluated by three clinicians with clinical experience in HPV management. Each response was rated on a five-point Likert scale for accuracy and clinical relevance. Inter-rater reliability among the reviewers was assessed using Cohen's kappa statistic.

Results: The mean accuracy scores assigned by the three reviewers to ChatGPT's answers to HPV-related questions were 4.9 ± 0.3, 4.75 ± 0.44, and 4.75 ± 0.55, respectively. The percentages of responses rated as correct by the reviewers were 90%, 75%, and 80%, respectively. The percentages rated as nearly correct were 10%, 25%, and 15%, and the percentages rated as approximately equally correct and incorrect were 0%, 0%, and 5%, respectively.

Conclusion: ChatGPT 4.0 demonstrated high efficacy in providing general information regarding HPV, with an 81.6% accuracy rate and a 90% near-accuracy rate. Incorporating AI tools to facilitate patient access to information could enhance learning processes. However, it is essential that these tools be continuously refined and used to complement, rather than substitute for, the critical judgment of medical professionals and patients.
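
The agreement analysis named in the Methods can be illustrated with a minimal Python sketch. It assumes that Cohen's kappa, which is defined for two raters, was applied pairwise across the three reviewers; the abstract does not state this. The ratings below are hypothetical placeholders, not the study's data, and the names `cohen_kappa`, `ratings`, and `pairwise_kappa` are introduced here for illustration only. The final lines average the three correct-score percentages reported in the Results (90%, 75%, 80%), which gives a value close to the 81.6% accuracy rate quoted in the Conclusion; that this is how the figure was derived is also an assumption.

```python
from collections import Counter
from itertools import combinations


def cohen_kappa(a, b):
    """Cohen's kappa for two raters scoring the same items."""
    if len(a) != len(b) or not a:
        raise ValueError("rating lists must be non-empty and of equal length")
    n = len(a)
    # Observed agreement: fraction of items given identical scores.
    p_o = sum(x == y for x, y in zip(a, b)) / n
    # Chance agreement from each rater's marginal score frequencies.
    count_a, count_b = Counter(a), Counter(b)
    p_e = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    if p_e == 1.0:
        # Degenerate case: both raters used one identical category throughout.
        # Treating it as perfect agreement is a convention assumed here.
        return 1.0
    return (p_o - p_e) / (1 - p_e)


# Hypothetical placeholder ratings (NOT the study's data): 1-5 Likert scores.
ratings = {
    "reviewer_1": [5, 5, 4, 5, 5, 5, 4, 5, 5, 5],
    "reviewer_2": [5, 4, 4, 5, 5, 5, 5, 5, 4, 5],
    "reviewer_3": [5, 5, 4, 5, 3, 5, 4, 5, 5, 5],
}

# Assumption: kappa is computed for each pair of reviewers.
pairwise_kappa = {
    (r1, r2): cohen_kappa(ratings[r1], ratings[r2])
    for r1, r2 in combinations(ratings, 2)
}

for pair, kappa in pairwise_kappa.items():
    print(pair, round(kappa, 3))

# Average of the three correct-score percentages reported in the abstract.
reported_correct = [90, 75, 80]
mean_correct = sum(reported_correct) / len(reported_correct)
print(round(mean_correct, 2))
```

For three raters scored jointly, a multi-rater statistic such as Fleiss' kappa is a common alternative to pairwise Cohen's kappa; which approach the authors used cannot be determined from the abstract alone.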