
EVALUATING THE REFERENCE ACCURACY OF LARGE LANGUAGE MODELS IN RADIOLOGY: A COMPARATIVE STUDY ACROSS SUBSPECIALTIES

Yasin Celal GÜNEŞ, Turay CESUR, Eren ÇAMUR

Diagnostic and Interventional Radiology - 2026;32(2):173-181

Kırıkkale Yüksek İhtisas Hospital, Kırıkkale

 

This study aimed to compare six large language models (LLMs) [Chat Generative Pre-trained Transformer (ChatGPT) o1-preview, ChatGPT-4o, ChatGPT-4o with canvas, Google Gemini 1.5 Pro, Claude 3.5 Sonnet, and Claude 3 Opus] in generating radiology references, assessing the accuracy, fabrication, and bibliographic completeness of the references they produced.