INTEROBSERVER RELIABILITY OF THE KELLGREN-LAWRENCE CLASSIFICATION IN KNEE OSTEOARTHRITIS: A COMPARISON BETWEEN ORTHOPEDIC SURGEONS AND ARTIFICIAL INTELLIGENCE

Şafak SAYAR, Mustafa BOZ, Yasemin Begüm TOPKARCI, Suat BATAR, Necdet DEMİR

Eurasian Journal of Medical Investigation - 2026;10(1):80-83

Department of Orthopaedics and Traumatology, Biruni University Faculty of Medicine Hospital, Istanbul

Objectives: To evaluate the interobserver reliability of the Kellgren-Lawrence (KL) classification among orthopedic surgeons and to compare their assessments with those of artificial intelligence (AI) systems.

Methods: One hundred anteroposterior weight-bearing knee radiographs from patients aged 65 years and older were retrospectively analyzed. Four orthopedic surgeons and two AI systems (ChatGPT and Gemini) independently graded all radiographs according to the KL classification, blinded to clinical information and to each other's evaluations. Interobserver agreement was assessed using quadratically weighted Cohen's kappa and intraclass correlation coefficients (ICC).

Results: Interobserver agreement among orthopedic surgeons demonstrated good reliability (mean weighted kappa = 0.780; ICC = 0.784). Agreement between the orthopedic consensus and ChatGPT was moderate (kappa = 0.481), whereas Gemini demonstrated moderate-to-good agreement (kappa = 0.561). Agreement between the two AI systems was also moderate (kappa = 0.484).

Conclusion: The KL classification demonstrated good reliability among orthopedic surgeons. AI systems demonstrated moderate agreement with orthopedic experts and may serve as supportive screening tools rather than as diagnostic replacements.
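For readers unfamiliar with the agreement statistic named in the Methods, the following is a minimal sketch of quadratically weighted Cohen's kappa for two raters. It is not the authors' analysis code; the abstract does not state which software was used. The function name `quadratic_weighted_kappa`, the assumption that KL grades are coded as integers 0 to 4, and the example grade lists are all hypothetical illustrations, not study data.

```python
from collections import Counter


def quadratic_weighted_kappa(rater_a, rater_b, n_grades=5):
    """Quadratically weighted Cohen's kappa for two raters.

    Assumes ordinal grades coded 0..n_grades-1 (KL grades 0-4 by default).
    Hypothetical helper for illustration only.
    """
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must grade the same non-empty set of cases")

    n = len(rater_a)
    observed = Counter(zip(rater_a, rater_b))  # joint counts of (grade_a, grade_b)
    marginal_a = Counter(rater_a)              # per-grade counts for rater A
    marginal_b = Counter(rater_b)              # per-grade counts for rater B
    max_sq_dist = (n_grades - 1) ** 2

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(n_grades):
        for j in range(n_grades):
            # Disagreement weight grows with the squared distance between grades.
            weight = (i - j) ** 2 / max_sq_dist
            weighted_observed += weight * observed[(i, j)] / n
            weighted_expected += weight * (marginal_a[i] / n) * (marginal_b[j] / n)

    if weighted_expected == 0.0:
        # Both raters used one identical grade for every case: kappa is undefined.
        return float("nan")
    return 1.0 - weighted_observed / weighted_expected


# Hypothetical grades for ten radiographs (not data from the study).
surgeon = [0, 1, 2, 2, 3, 4, 1, 3, 2, 4]
ai_model = [0, 2, 2, 1, 3, 3, 1, 4, 2, 4]
print(round(quadratic_weighted_kappa(surgeon, ai_model), 3))
```

Quadratic weights penalize a two-grade disagreement four times as heavily as a one-grade disagreement, which suits an ordinal scale such as KL. In practice, scikit-learn's `cohen_kappa_score(y1, y2, weights="quadratic")` computes the same statistic.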
