A COMPARATIVE ANALYSIS OF U-NET-BASED ARCHITECTURES FOR ROBUST SEGMENTATION OF BLADDER CANCER LESIONS IN MAGNETIC RESONANCE IMAGING

Ishak Pacal, Yigitcan Cakmak

Eurasian Journal of Medicine and Oncology - 2025;9(4):268-283

Department of Computer Engineering, Faculty of Engineering, Igdir University, Igdir, Türkiye

Introduction: Bladder cancer (BCa) represents a significant uro-oncological challenge due to its aggressive nature and high recurrence rates. Although magnetic resonance imaging (MRI) is a cornerstone modality in BCa management, manual segmentation of lesions is time-consuming and poorly reproducible because of inter- and intra-observer variability, morphological heterogeneity, and MRI artifacts.

Objective: This study aims to address these limitations through a rigorous comparative evaluation of four distinct U-Net-based deep learning architectures.

Methods: The models were evaluated on the publicly available, multi-center FedBCa dataset, comprising 275 T2-weighted MRI scans from 228 patients. Under a standardized training protocol, performance was assessed with a suite of quantitative metrics, including the Dice coefficient, intersection over union (IoU), and Hausdorff distance, supplemented by qualitative visual comparison.

Results: Cross-scale mixer U-Net (CMUNet) achieved the best overall performance, yielding the highest Dice coefficient (0.7937), the highest IoU (0.7033), and the best boundary delineation accuracy (Hausdorff distance: 8.4550 mm). Architectural trade-offs were evident: CMUNeXt was the most computationally efficient and offered the highest lesion sensitivity (0.9656), whereas Attention U-Net recorded the highest precision (0.8380).

Conclusion: CMUNet provides the most balanced and accurate performance for BCa segmentation. However, the optimal architecture is application-dependent: high-sensitivity models such as CMUNeXt are well suited to screening, while high-precision models such as Attention U-Net are better suited to treatment planning. The top model's accuracy approached, but did not surpass, the inter-rater reliability of human experts (Dice: 0.870). Deep learning models therefore serve as powerful assistive tools for improving efficiency and objectivity in clinical workflows, though expert oversight remains essential.
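For readers unfamiliar with the metrics named above, the following is a minimal sketch of how Dice, IoU, sensitivity, precision, and the Hausdorff distance can be computed for a pair of binary masks. It is an illustration and not the authors' evaluation code: the function names, the `spacing` parameter, and the conventions for empty masks are assumptions, and the Hausdorff distance here is taken over all foreground pixels, whereas the study may have used a surface-based or percentile variant.

```python
import numpy as np


def overlap_metrics(pred, target):
    """Dice, IoU, sensitivity, and precision for two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    tp = np.logical_and(pred, target).sum()    # lesion pixels found
    fp = np.logical_and(pred, ~target).sum()   # false alarms
    fn = np.logical_and(~pred, target).sum()   # missed lesion pixels

    def ratio(num, den):
        # Assumed convention: an empty denominator is reported as 1.0.
        return float(num) / float(den) if den > 0 else 1.0

    return {
        "dice": ratio(2 * tp, 2 * tp + fp + fn),
        "iou": ratio(tp, tp + fp + fn),
        "sensitivity": ratio(tp, tp + fn),
        "precision": ratio(tp, tp + fp),
    }


def hausdorff_distance(pred, target, spacing=(1.0, 1.0)):
    """Symmetric Hausdorff distance between foreground pixels.

    `spacing` is the (hypothetical) pixel size per axis, e.g. in mm,
    so the result is expressed in the same physical unit.
    """
    p = np.argwhere(np.asarray(pred, dtype=bool)) * np.asarray(spacing)
    t = np.argwhere(np.asarray(target, dtype=bool)) * np.asarray(spacing)
    if len(p) == 0 or len(t) == 0:
        return float("inf")  # assumed convention when a mask is empty
    # Brute-force pairwise distances; fine for small 2D masks.
    d = np.linalg.norm(p[:, None, :] - t[None, :, :], axis=-1)
    return float(max(d.min(axis=1).max(), d.min(axis=0).max()))


# Toy example: the prediction misses one column of the reference lesion.
pred = np.zeros((4, 4), dtype=int)
pred[1:3, 1:3] = 1
target = np.zeros((4, 4), dtype=int)
target[1:3, 1:4] = 1

metrics = overlap_metrics(pred, target)
hd = hausdorff_distance(pred, target)
print(metrics, hd)
```

In this toy case the prediction has no false positives, so precision is perfect while sensitivity is not, which mirrors the precision versus sensitivity trade-off reported between architectures.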