
EVALUATING THE SUCCESS RATE OF THE ONLINE CHAT-BASED ARTIFICIAL INTELLIGENCE PROGRAM CHATGPT IN ANSWERING BASIC QUESTIONS RELATED TO THYROID CANCER

YİĞİT TÜRK, BAHADIR EMRE BAKİ, AHMET CEM DURAL, SERKAN TEKSÖZ, ÖZER MAKAY, RECEP GÖKHAN İÇÖZ, MURAT ÖZDEMİR

Anatolian Journal of General Medical Research - 2025;35(1):83-89

Ege University Faculty of Medicine, Department of General Surgery, İzmir, Türkiye


OBJECTIVE

ChatGPT, an advanced artificial intelligence (AI)-based conversational bot built on a large language model, is designed to understand user inputs and generate responses to them. This study aims to assess the accuracy of the responses ChatGPT provides to questions that patients might ask about thyroid cancer.

METHODS

A total of 27 questions in Turkish, relevant to thyroid cancer and likely to be asked by non-healthcare professionals, were prepared under four categories (general information, diagnosis, treatment, follow-up). These questions were posed to the free public version of ChatGPT, version 3.5. Three experts in endocrine surgery (A.C.D., S.T., Ö.M.) evaluated the responses, classifying each into one of three categories: appropriate, inappropriate, or insufficient/incomplete.

RESULTS

Across the four question categories, 9 responses (33.3%) were rated "appropriate" by two of the three experts and "insufficient/incomplete" by one, while 6 responses (22.2%) were rated "appropriate" by two experts and "inappropriate" by one. Overall, 16 responses (59.25%) were rated "appropriate" by at least two experts.

CONCLUSION

At this stage, AI-based conversational programs such as ChatGPT cannot be regarded as a substitute for a medical specialist when patients seek medical advice.