Leveraging Model Merging and Multi-Modal Medical Imaging Data for Diagnosis & Generation Tasks
Date of Award
4-30-2024
Document Type
Thesis
Degree Name
Master of Science in Computer Vision
Department
Computer Vision
First Advisor
Prof. Fahad Khan
Second Advisor
Prof. Hisham Cholakkal
Abstract
Given the scarcity of well-annotated medical datasets, leveraging transfer learning from broader datasets like ImageNet or pre-trained models like CLIP is crucial. Model soups, which average the weights of multiple fine-tuned models, aim to enhance performance on in-domain tasks and improve robustness on out-of-distribution datasets. However, applying these methods to medical imaging is challenging because of data complexities such as heterogeneity and domain shift. To address this issue, a hierarchical merging approach is proposed that aggregates models based on their hyperparameter configurations. Additionally, a computationally efficient method using cyclical learning rate scheduling reduces the need to train numerous models. This approach shows significant improvements over model soups, particularly on out-of-distribution datasets, while maintaining low computational costs.
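The ideas named in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: the abstract does not specify the grouping rule, the merge order, or the exact learning rate schedule. The function names (`average_weights`, `hierarchical_merge`, `cyclical_lr`), the choice of grouping by a single hyperparameter, and the cosine-shaped cycle are all assumptions made for illustration. Weights are represented as plain dictionaries of float lists so the example runs without any deep learning library.

```python
import math
from collections import defaultdict


def average_weights(weight_dicts):
    """Uniform 'model soup': element-wise mean of several weight dictionaries.

    Each dictionary maps a parameter name to a flat list of floats.
    All dictionaries are assumed to share the same keys and shapes.
    """
    if not weight_dicts:
        raise ValueError("need at least one set of weights")
    n = len(weight_dicts)
    return {
        name: [sum(vals) / n for vals in zip(*(w[name] for w in weight_dicts))]
        for name in weight_dicts[0]
    }


def hierarchical_merge(models, group_key):
    """Hypothetical two-level merge (an assumption, not the thesis's exact rule).

    models: list of (hyperparams, weights) pairs.
    Step 1: average models that share the same value of hyperparams[group_key].
    Step 2: average the resulting group-level models.
    """
    groups = defaultdict(list)
    for hyperparams, weights in models:
        groups[hyperparams[group_key]].append(weights)
    group_means = [average_weights(ws) for ws in groups.values()]
    return average_weights(group_means)


def cyclical_lr(step, cycle_len, lr_max, lr_min=0.0):
    """Cosine-shaped cyclical schedule (assumed form).

    The rate restarts at lr_max every cycle_len steps and decays toward lr_min.
    A checkpoint saved at the end of each cycle could serve as one merge
    candidate, so a single training run yields several models.
    """
    t = step % cycle_len
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + math.cos(math.pi * t / cycle_len))


if __name__ == "__main__":
    # Three toy "fine-tuned models" with a single one-element parameter.
    models = [
        ({"lr": 1e-3}, {"w": [1.0]}),
        ({"lr": 1e-3}, {"w": [3.0]}),
        ({"lr": 1e-4}, {"w": [10.0]}),
    ]
    print("uniform soup:", average_weights([w for _, w in models]))
    print("hierarchical:", hierarchical_merge(models, group_key="lr"))
    print("lr over one cycle:", [round(cyclical_lr(s, 4, 0.1), 4) for s in range(5)])
```

In this toy setup the two merges differ because the hierarchical version gives each hyperparameter group equal weight, regardless of how many models the group contains, whereas the uniform soup weights every model equally.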
Recommended Citation
S. Sanjeev, "Leveraging Model Merging and Multi-Modal Medical Imaging Data for Diagnosis & Generation Tasks," Apr 2024.
Comments
Thesis submitted to the Deanship of Graduate and Postdoctoral Studies
In partial fulfilment of the requirements for the M.Sc degree in Computer Vision
Advisors: Mohammad Yaqub, Karthik Nandakumar
With a 2-year embargo period