
IMPACT OF LARGE LANGUAGE MODEL ASSISTANCE ON EVALUATION OF COMPLEX MEDICAL LIVING KIDNEY DONOR RECIPIENTS: A PROSPECTIVE, ROLE-STRATIFIED ANALYSIS

Hari Shankar MESHRAM, Vishal BATHEJA, Bhavin MODASIA, Saurabh PURI, Chandani BHAGAT, Sreenivas Rao GADIREDDY, Rajendra Prasad MATHUR

Experimental and Clinical Transplantation - 2026;24(1):43-50

Department of Nephrology, ILBS Vasant Kunj, New Delhi, India

 

Objectives: Living donor kidney transplant often involves decisions that require complex recipient evaluation under substantial cognitive load. Large language models may augment clinician reasoning; however, their role in transplant assessment remains unexplored.

Materials and Methods: We conducted a prospective, vignette-based study evaluating ChatGPT (GPT-5, OpenAI) as a decision-support tool for assessing living donor kidney transplant recipients. Fourteen nephrology fellows completed 22 expert-validated vignettes unaided and, after a 72-hour washout, completed the vignettes with the assistance of ChatGPT. Primary outcomes were accuracy (agreement with expert reference) and completeness (coverage of critical decision elements). Secondary outcomes included unsafe potential (risk of harm from omission/error) and cognitive workload (NASA Task Load Index).

Results: We analyzed 308 paired responses. Accuracy improved from 68.4% (SD 7.5) to 86.2% (SD 5.6; mean change of 17.8 percentage points, d=1.25). Completeness rose from 63.5% (SD 8.1) to 82.1% (SD 6.9; mean change of 18.6 percentage points, d=1.31). Gains were largest in complex vignettes (change of 24.5 percentage points; P = .002). Unsafe potential was present in 18 vignettes (82%) unaided and in 5 vignettes (23%) with ChatGPT, an absolute reduction of 59 percentage points (P < .001). Omissions fell from 6 (27%) to 2 (9%), an absolute reduction of 18 percentage points (P = .01). NASA Task Load Index scores declined substantially, with large effect sizes in mental demand (d=3.55), effort (d=3.65), and frustration (d=2.97).

Conclusions: In our study, large language model support significantly improved accuracy, completeness, and safety while reducing cognitive workload among nephrology fellows. These findings suggest that large language models can serve as both cognitive aids and safety nets in transplant evaluation. Real-world validation and continuous audits are necessary before integration into clinical workflows.
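The counts and percentages reported in the Results can be cross-checked with simple arithmetic. The following is a minimal Python sketch, not part of the study's analysis; the variable names and the `pct` helper are illustrative assumptions, and the inputs are only the figures stated in the abstract. It does not reproduce the effect sizes or P values, which require the underlying per-fellow data.

```python
# Cross-check of figures stated in the abstract.
# All names here are hypothetical; only the numbers come from the abstract.

FELLOWS = 14      # nephrology fellows
VIGNETTES = 22    # expert-validated vignettes


def pct(count: int, total: int) -> int:
    """Share of `total` as a whole-number percentage."""
    return round(100 * count / total)


# Each fellow answers each vignette unaided and assisted: one pair per combination.
paired_responses = FELLOWS * VIGNETTES

# Unsafe potential: 18 vignettes unaided vs 5 with assistance.
unsafe_unaided = pct(18, VIGNETTES)
unsafe_assisted = pct(5, VIGNETTES)
unsafe_reduction = unsafe_unaided - unsafe_assisted

# Omissions: 6 vignettes unaided vs 2 with assistance.
omit_unaided = pct(6, VIGNETTES)
omit_assisted = pct(2, VIGNETTES)
omit_reduction = omit_unaided - omit_assisted

# Mean changes in accuracy and completeness, in percentage points.
accuracy_change = round(86.2 - 68.4, 1)
completeness_change = round(82.1 - 63.5, 1)

print(paired_responses, unsafe_reduction, omit_reduction,
      accuracy_change, completeness_change)
```

The absolute reductions are differences between the rounded percentages, which matches how the abstract reports them (82% minus 23%, and 27% minus 9%).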