An Advanced Deep Learning-Driven Framework for Automated Detection and Multi-Stage Classification of Diabetic Retinopathy Using High-Resolution Retinal Fundus Imaging
Abstract
Diabetic retinopathy (DR) remains the leading cause of preventable blindness among the working-age population worldwide, with an estimated 103 million individuals affected globally as of 2020 and projections indicating 160 million by 2045 [1]. Early and accurate staging of DR is critical for timely clinical intervention, yet manual fundus image grading by ophthalmologists is resource-intensive, subjective, and impractical at population scale. This paper presents an advanced deep learning-driven framework, designated DRNet-X, for automated detection and multi-stage classification of diabetic retinopathy using high-resolution retinal fundus images. The proposed framework integrates a modified EfficientNet-B5 convolutional backbone augmented with a novel dual-attention mechanism, combining channel-wise squeeze-and-excitation and spatial transformer attention modules, to enhance lesion-specific feature extraction. A multi-scale feature pyramid network fuses representations across resolution levels, improving the simultaneous detection of microaneurysms, hard exudates, soft exudates, hemorrhages, and neovascularization. DRNet-X was trained and evaluated on three benchmark datasets (APTOS 2019, EyePACS, and MESSIDOR-2) encompassing 92,416 fundus images spanning the five International Clinical Diabetic Retinopathy (ICDR) severity grades (No DR, Mild, Moderate, Severe, Proliferative DR). On the combined held-out test set, the model achieved an overall five-class classification accuracy of 94.7%, a quadratic weighted kappa (QWK) of 0.934, an area under the ROC curve (AUC) of 0.978, a sensitivity of 96.1%, and a specificity of 93.8%. Comparative benchmarking against InceptionV3, ResNet-50, VGG-16, DenseNet-121, and EfficientNet-B4 baselines demonstrates consistent superiority of DRNet-X across all evaluation metrics. Ablation studies confirm the additive contribution of each architectural component. The framework provides a robust, scalable, and clinically interpretable solution for DR screening in resource-constrained ophthalmic settings.
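For readers unfamiliar with the quadratic weighted kappa reported above, the following is a minimal sketch of the standard definition for ordinal labels, here applied to the five ICDR grades encoded as integers 0 to 4. It is not taken from the DRNet-X implementation: the function name `quadratic_weighted_kappa`, the integer grade encoding, and the handling of the degenerate single-class case are assumptions made for illustration.

```python
def quadratic_weighted_kappa(y_true, y_pred, num_classes=5):
    """Cohen's kappa with quadratic weights for ordinal labels 0..num_classes-1.

    Illustrative sketch only; the name and signature are hypothetical.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")
    n = len(y_true)

    # Observed confusion matrix: observed[i][j] counts true grade i predicted as grade j.
    observed = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        observed[t][p] += 1

    # Marginal histograms of true and predicted grades.
    true_hist = [sum(row) for row in observed]
    pred_hist = [sum(observed[i][j] for i in range(num_classes))
                 for j in range(num_classes)]

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(num_classes):
        for j in range(num_classes):
            # Quadratic penalty: grows with the squared distance between grades.
            weight = (i - j) ** 2 / (num_classes - 1) ** 2
            # Expected count under chance agreement (independent marginals).
            expected = true_hist[i] * pred_hist[j] / n
            weighted_observed += weight * observed[i][j]
            weighted_expected += weight * expected

    if weighted_expected == 0.0:
        # Assumption of this sketch: kappa is undefined when every label and
        # prediction falls in a single class, so report it as an error.
        raise ValueError("kappa is undefined when only one class is present")
    return 1.0 - weighted_observed / weighted_expected


# Usage: identical gradings give 1.0; fully reversed gradings give -1.0.
grades = [0, 1, 2, 3, 4]
print(quadratic_weighted_kappa(grades, grades))        # 1.0
print(quadratic_weighted_kappa(grades, grades[::-1]))  # -1.0
```

Because the penalty grows with the squared distance between grades, confusing No DR with Proliferative DR costs far more than confusing adjacent grades, which is why QWK is commonly preferred over plain accuracy for ordinal severity scales.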