A Radiogenomics Pipeline for Segmentation of Lung Nodules in CT Scans and Prediction of EGFR Mutation Status

Lung cancer is the leading cause of cancer death worldwide, and early-stage detection is essential for a more favorable prognosis. Medical imaging such as CT or PET helps clinicians localize a nodule and characterize its radiomic properties, while an invasive biopsy determines the tumor's genomic profile to guide personalized treatment planning. Radiogenomics is an emerging discipline that combines radiomic features with clinical data (genomic expression, histology, outcomes, etc.) for non-invasive treatment assessment. This study presents a two-step radiogenomics pipeline: 1) a novel mixed 3D architecture ("Attention-to-Recurrence UNet") generates a segmentation mask; 2) deep features from the segmentation model are processed and fed into machine learning (ML) classifiers to predict EGFR mutation status. The primary study dataset is NSCLC-Radiogenomics, in which 144 patients have CT images with available segmentations, of whom 117 have known EGFR mutation status. The NSCLC-Radiomics and Medical Decathlon Lung datasets support the evaluation of the segmentation architecture and the fine-tuning of the final model. The proposed segmentation model outperforms state-of-the-art (SOTA) results with a Dice similarity coefficient (DSC) of 75.26 in lung nodule segmentation, thanks to specialized modules that help localize the tumor and exploit spatio-temporal features, and to fine-tuning from pretrained weights. The proposed classification methodology outperforms other classical machine learning and deep learning approaches with a ROC-AUC of 0.935. This work strengthens research on radiogenomics in lung cancer with a fully automated pipeline built on a limited dataset. Artificial intelligence techniques can help clinicians identify dangerous tumors before surgery and forecast prognosis non-invasively, preventing over- or under-treatment and complications from surgical biopsy.
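The two metrics reported above, DSC for segmentation and ROC-AUC for classification, can be illustrated with a minimal sketch. This is not the thesis implementation: the function names `dice_coefficient` and `roc_auc` are hypothetical, the masks are assumed to be flattened binary voxel lists, and the toy inputs in the usage note are illustrative values rather than study data. The sketch also assumes the reported DSC of 75.26 is expressed on a 0 to 100 scale, whereas the function below returns a value between 0 and 1.

```python
from itertools import product


def dice_coefficient(pred, target):
    """Dice similarity coefficient between two flat binary masks (0/1 values)."""
    pred = list(pred)
    target = list(target)
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of voxels")
    # Voxels marked as nodule in both the prediction and the reference.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    # Convention assumed here: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


def roc_auc(labels, scores):
    """ROC-AUC as the probability that a positive case outranks a negative one.

    Ties between a positive and a negative score contribute 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p, n in product(pos, neg)
    )
    return wins / (len(pos) * len(neg))
```

For example, `dice_coefficient([1, 1, 0, 0], [1, 0, 0, 0])` compares a predicted mask of two voxels with a reference mask of one voxel that overlaps it, and `roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])` scores four hypothetical patients, two mutant and two wild-type. The pairwise ROC-AUC is quadratic in the number of cases, which is acceptable at the scale of a cohort such as the 117 patients mentioned above.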

Thesis submitted to the Deanship of Graduate and Postdoctoral Studies

In partial fulfillment of the requirements for the M.Sc. degree in Machine Learning

Advisors: Dr. Mohammad Yaqub, Dr. Huan Xiong

Online access provided for MBZUAI patrons