Exploring the Potential of Deep Learning for Biomedical Image Segmentation

Document Type



Medical image segmentation is a fundamental task in biomedical image analysis: it identifies and isolates regions of interest in images such as MRI or CT scans. This process is vital for accurate diagnosis and effective treatment planning across a range of conditions, including tumors and other anomalies. Segmenting medical images is challenging, however, because of their complex and diverse nature: they contain structures of varying size, shape, and intensity, as well as noise, artifacts, and missing data. Traditional segmentation methods that rely on handcrafted features and heuristic algorithms struggle with this complexity and variability. Recent deep learning-based techniques have shown promising results by learning feature representations automatically, but further innovation in network architecture, learning frameworks, and training procedures is needed to improve the segmentation process. In this thesis, we introduce new deep learning-based methods for medical image segmentation, focusing on frameworks for high-resolution and volumetric images. The proposed techniques segment a variety of structures, including organs, brain tumors, skin lesions, polyps, and retinal vessels, using medical image data from modalities such as CT, MRI, endoscopy imaging, and fundus imaging. The thesis proposes two novel deep learning-based techniques. The first, TransResNet, combines a transformer and a CNN to segment high-resolution medical images effectively. It introduces a Cross Grafting Module (CGM) that generates grafted features by combining global semantic information with low-level spatial detail. This approach achieves state-of-the-art or competitive results on several segmentation tasks, including skin lesion, retinal vessel, and polyp segmentation.
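The grafting idea can be illustrated with a minimal, purely illustrative sketch; the function name and the fusion rule below are assumptions for exposition, not the thesis implementation. The intuition is that global semantic features (from the transformer branch) gate low-level spatial features (from the CNN branch), with a residual path preserving local detail:

```python
def cross_graft(global_feats, local_feats):
    """Illustrative fusion of per-position features (hypothetical rule).

    global_feats: per-position semantic scores from a transformer branch.
    local_feats: per-position spatial detail from a CNN branch.
    Returns grafted features: local detail reweighted by global context,
    plus a residual copy of the local detail.
    """
    assert len(global_feats) == len(local_feats)
    # Gate local spatial detail with global semantics; keep a residual path.
    return [g * l + l for g, l in zip(global_feats, local_feats)]

grafted = cross_graft([0.5, 1.0], [2.0, 3.0])
# grafted == [3.0, 6.0]
```

In a real network the gating and residual combination would operate on multi-channel feature maps with learned projections; this sketch only shows the grafting pattern at the level of single values.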
The second technique is Y-CA-Net, a versatile generic architecture for segmenting volumetric medical images across modalities (CT, MRI). It pairs any two encoders with a decoder backbone: one encoder branch extracts local features using convolution filters, while the other learns global interactions through an attention mechanism, fully exploiting the strengths of both convolution and attention for volumetric segmentation. The model introduces a simple yet effective Cross Feature Mixer Module (CFMM) that learns semantically enriched features by mixing local and global feature representations. Y-CA-Net shows significant improvements over existing state-of-the-art methods on benchmark datasets for multi-organ and brain tumor segmentation.
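The feature-mixing step can likewise be sketched in miniature; the function name and the convex-combination rule are illustrative assumptions, not the CFMM as implemented in the thesis. The point is only that local (convolutional) and global (attention) encoder outputs are combined position-wise into one enriched representation:

```python
def mix_features(local_feats, global_feats, alpha=0.5):
    """Illustrative cross-feature mixing (hypothetical rule).

    local_feats: per-position features from the convolutional encoder branch.
    global_feats: per-position features from the attention encoder branch.
    alpha: mixing weight; in a real module this would be learned, not fixed.
    """
    assert len(local_feats) == len(global_feats)
    # Convex combination of the two branches at each position.
    return [alpha * l + (1 - alpha) * g
            for l, g in zip(local_feats, global_feats)]

mixed = mix_features([2.0, 4.0], [0.0, 2.0])
# mixed == [1.0, 3.0]
```

An actual mixer module would typically concatenate multi-channel feature volumes and apply learned projections before passing the result to the decoder; the sketch shows only the mixing pattern.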

Publication Date



Thesis submitted to the Deanship of Graduate and Postdoctoral Studies

In partial fulfillment of the requirements for the M.Sc. degree in Computer Vision

Advisors: Dr. Min Xu, Dr. Mohammad Yaqub

1-year embargo period

This document is currently not available here.