Preventing Overfitting In Transcription Factor Binding Location Prediction Model

Date of Award

4-30-2024

Document Type

Thesis

Degree Name

Master of Science in Machine Learning

Department

Machine Learning

First Advisor

Dr. Martin Takac

Second Advisor

Dr. Gus Xia

Abstract

Predicting where transcription factors (TFs) bind is central to understanding gene regulatory mechanisms. This interdisciplinary research area draws on biology, computational biology, machine learning, and bioinformatics to predict the positions at which TFs interact with DNA sequences and thereby influence gene expression. The problem matters broadly: it bears on many biological processes, disease mechanisms, and evolutionary studies. In our model, we intend to take a straightforward approach, employing a Convolutional Neural Network (CNN) built on Conv1D layers, the one-dimensional counterpart of Conv2D suited to sequential data. To prevent the model from overfitting the training data, we plan to incorporate measures such as Dropout layers.
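The two ingredients named in the abstract, a one-dimensional convolution over a DNA sequence and dropout as a regularizer, can be illustrated with a small sketch. This is not the thesis model: the abstract does not give the architecture, so the one-hot encoding, the single hand-set "TATA" kernel, the dropout rate of 0.5, and all function names below are assumptions chosen for illustration, and the code uses only the Python standard library, whereas a real model would likely use a deep learning framework with learned kernels.

```python
import random

BASES = "ACGT"

def one_hot(seq):
    """Encode a DNA string as a list of 4-element rows ordered A, C, G, T."""
    return [[1.0 if base == b else 0.0 for b in BASES] for base in seq.upper()]

def conv1d(x, kernel, bias=0.0):
    """'Valid' 1D convolution (cross-correlation, as in most deep learning
    libraries) of x (length L, 4 channels) with one kernel (width k, 4 channels).
    Returns L - k + 1 scores, one per window start position."""
    k = len(kernel)
    out = []
    for start in range(len(x) - k + 1):
        total = bias
        for i in range(k):
            for c in range(len(kernel[i])):
                total += x[start + i][c] * kernel[i][c]
        out.append(total)
    return out

def relu(values):
    """Rectified linear activation applied element-wise."""
    return [max(0.0, v) for v in values]

def dropout(values, rate, training, rng):
    """Inverted dropout: during training, zero each activation with
    probability `rate` and rescale survivors by 1 / (1 - rate);
    at inference, return the activations unchanged."""
    if not training or rate == 0.0:
        return list(values)
    keep = 1.0 - rate
    return [v / keep if rng.random() < keep else 0.0 for v in values]

# Hypothetical hand-set kernel: scores one point per base matching "TATA".
# In a trained CNN the kernel weights would be learned from data instead.
motif_kernel = one_hot("TATA")

seq = "GGTATACC"
scores = relu(conv1d(one_hot(seq), motif_kernel))
best_position = max(range(len(scores)), key=scores.__getitem__)

rng = random.Random(0)  # fixed seed so the training-mode mask is repeatable
train_scores = dropout(scores, rate=0.5, training=True, rng=rng)
eval_scores = dropout(scores, rate=0.5, training=False, rng=rng)
```

Here `scores` holds one value per window of the sequence, and `best_position` is the start of the window that best matches the kernel. Dropout only perturbs activations in training mode, which is why `eval_scores` equals `scores`; randomly silencing activations during training discourages the network from relying on any single feature, which is the overfitting concern the abstract raises.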

Comments

Thesis submitted to the Deanship of Graduate and Postdoctoral Studies

In partial fulfilment of the requirements for the M.Sc degree in Machine Learning

Advisors: Dr. Martin Takac, Dr. Gus Xia

Online access available for MBZUAI patrons
