Leveraging Model Merging and Multi-Modal Medical Imaging Data for Diagnosis & Generation Tasks

Date of Award

4-30-2024

Document Type

Thesis

Degree Name

Master of Science in Computer Vision

Department

Computer Vision

First Advisor

Prof. Fahad Khan

Second Advisor

Prof. Hisham Cholakkal

Abstract

Because well-annotated medical datasets are scarce, transfer learning from broader datasets such as ImageNet or from pre-trained models such as CLIP is crucial. Model soups, which average the weights of multiple fine-tuned models, aim to improve performance on in-domain tasks and robustness on out-of-distribution datasets. However, applying these methods to medical imaging is challenging because of data complexities such as heterogeneity and domain shift. To address these challenges, a hierarchical merging approach is proposed that aggregates models according to their hyperparameter configurations. In addition, a computationally efficient method based on cyclical learning rate scheduling reduces the need to train numerous models. This approach shows significant improvements over model soups, particularly on out-of-distribution datasets, while keeping computational costs low.
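
The difference between a flat model soup and a hierarchical merge can be illustrated with a minimal sketch. The full text is not available here, so this is not the thesis's implementation: the function names, the toy weights, and the choice to group by a single hyperparameter (`lr`) are all assumptions made for illustration. Parameters are plain lists of floats so the example runs without any deep learning library.

```python
from collections import defaultdict


def average_weights(state_dicts):
    """Uniformly average parameter dicts (name -> list of floats)."""
    if not state_dicts:
        raise ValueError("need at least one model to average")
    n = len(state_dicts)
    first = state_dicts[0]
    return {
        name: [sum(sd[name][i] for sd in state_dicts) / n
               for i in range(len(first[name]))]
        for name in first
    }


def hierarchical_soup(models, group_key):
    """Two-level merge (assumed scheme): average within each
    hyperparameter group, then average the group results."""
    groups = defaultdict(list)
    for hparams, weights in models:
        groups[hparams[group_key]].append(weights)
    group_soups = [average_weights(members) for members in groups.values()]
    return average_weights(group_soups)


# Hypothetical fine-tuned models: (hyperparameters, weights).
models = [
    ({"lr": 1e-3}, {"w": [1.0, 2.0]}),
    ({"lr": 1e-3}, {"w": [3.0, 4.0]}),
    ({"lr": 1e-4}, {"w": [5.0, 6.0]}),
]

flat = average_weights([weights for _, weights in models])
merged = hierarchical_soup(models, "lr")
```

In this toy setup the flat soup weights every model equally, whereas the two-level merge weights every hyperparameter group equally, so a configuration represented by only one model is not outvoted by a configuration with many runs.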

Comments

Thesis submitted to the Deanship of Graduate and Postdoctoral Studies

In partial fulfilment of the requirements for the M.Sc degree in Computer Vision

Advisors: Mohammad Yaqub, Karthik Nandakumar

With a two-year embargo period

This document is currently not available here.
