Domain Generalization in Diabetic Retinopathy Grading

Date of Award


Document Type


Degree Name

Master of Science in Machine Learning


Machine Learning

First Advisor

Dr. Muhammad Haris

Second Advisor

Dr. Shijian Lu


Diabetic Retinopathy (DR), which accounts for 5% of global blindness cases, arises from prolonged Diabetes Mellitus (DM). Recent findings project that the number of people living with DM will rise to 1.3 billion by 2050. As the DM burden intensifies, so does the prevalence of DR, necessitating a thorough exploration of efficient grading methodologies. While numerous deep learning approaches have sought to enhance traditional DR grading methods, they often falter when confronted with unseen test data whose distribution differs from the training distribution, which impedes their widespread application. In this study, we introduce a novel deep learning method for achieving domain generalization (DG) in DR classification and make the following contributions. First, we propose a new way of generating diagnostically relevant image-to-image fundus augmentations conditioned on the grade of the original fundus image. These augmentations are tailored to emulate the types of shifts found in DR datasets. Second, we address the limitations of the standard classification loss in DG for DR fundus datasets by proposing a new DG-specific alignment loss. Third, we tackle the coupled problem of data imbalance across domains and categories by employing focal loss, which integrates seamlessly with our new alignment loss. Fourth, we address the inevitable issue of observer variability in DR diagnosis, which induces label noise and hinders the model’s ability to learn domain-agnostic features that generalize to unseen target domains. To mitigate this challenge, we propose leveraging self-supervised learning (SSL) pretraining, even in scenarios where only a limited dataset of non-DR fundus images is available for pretraining. Our method demonstrates significant improvements, yielding gains of 5.5%, 3.8%, and 5.8% in accuracy, AUC, and F1-score, respectively, over the strong Empirical Risk Minimization (ERM) baseline. Furthermore, compared to GDRNet, a recently proposed state-of-the-art DG method for DR, our approach achieves gains of 3.5%, 1.5%, and 3.5% in accuracy, AUC, and F1-score, respectively. These results underscore the relative superiority of our method in addressing the challenges posed by domain shift in DR diagnosis.
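For readers unfamiliar with the focal loss mentioned in the abstract, the following is a minimal sketch of the standard multi-class form, FL = -alpha_t (1 - p_t)^gamma log(p_t), written with the Python standard library only. It is not the thesis implementation and does not include the proposed alignment loss. The function names, the five-grade example, and the default gamma = 2.0 are assumptions chosen for illustration.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def focal_loss(logits, target, gamma=2.0, alpha=None):
    """Standard focal loss for a single sample.

    logits: list of raw scores, one per class (here, hypothetically, one per DR grade).
    target: index of the true class.
    gamma:  focusing parameter; gamma = 0 recovers plain cross-entropy.
    alpha:  optional list of per-class weights (assumed, not taken from the thesis).
    """
    p_t = softmax(logits)[target]
    weight = 1.0 if alpha is None else alpha[target]
    # (1 - p_t)^gamma shrinks the loss of well-classified samples,
    # so rare or hard samples contribute relatively more to training.
    return -weight * (1.0 - p_t) ** gamma * math.log(p_t)


# Hypothetical five-grade example: the model is confident in grade 0.
example_logits = [2.0, 0.0, 0.0, 0.0, 0.0]
easy = focal_loss(example_logits, target=0)  # true grade matches the confident prediction
hard = focal_loss(example_logits, target=1)  # true grade was assigned low probability
```

In this sketch the easy sample is down-weighted far more strongly than the hard one, which is the property that makes focal loss a common choice for imbalanced classification.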


Thesis submitted to the Deanship of Graduate and Postdoctoral Studies

In partial fulfilment of the requirements for the M.Sc degree in Machine Learning

Advisors: Muhammad Haris, Shijian Lu

with 1 year embargo period

This document is currently not available here.