Sequence Pre-training-based Graph Neural Network for Predicting LncRNA-miRNA Associations

Document Type

Dissertation

Abstract

MicroRNAs (miRNAs) silence genes by binding to messenger RNAs (mRNAs), while long non-coding RNAs (lncRNAs) act as competitive endogenous RNAs (ceRNAs) that can relieve miRNA silencing effects and upregulate target gene expression. The ceRNA association between lncRNAs and miRNAs has been a research hotspot due to its medical importance, but it is challenging to verify experimentally. In this paper, we propose a novel deep learning scheme, the Sequence Pre-training-based Graph Neural Network (SPGNN), which combines pre-training and fine-tuning stages to predict lncRNA-miRNA associations from RNA sequence and graph data. In the pre-training stage, we use a sequence-to-vector technique to generate pre-trained embeddings from the sequences of all RNAs. In the fine-tuning stage, we use a graph neural network to learn node representations from the heterogeneous graph constructed from lncRNA-miRNA association information. We evaluate SPGNN on our newly collected animal lncRNA-miRNA association dataset and demonstrate that combining the k-mers technique and the Doc2vec model for pre-training with the Simple Graph Convolution (SGC) network for fine-tuning is effective in predicting lncRNA-miRNA associations. Our approach outperforms state-of-the-art baselines on various evaluation metrics. We also conduct an ablation study and a hyperparameter analysis to verify the effectiveness of each component and parameter of our scheme.
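
As a rough illustration of two building blocks named in the abstract, the sketch below shows k-mer tokenization of an RNA sequence and the feature propagation step of Simple Graph Convolution, S^K X with S = D^(-1/2) (A + I) D^(-1/2). It is a minimal standard-library sketch, not the thesis implementation: the function names `kmers` and `sgc_propagate`, the value of k, the number of hops, and the toy graph are all assumptions chosen for illustration. The Doc2vec step is omitted because it needs an external library, and the trainable classifier that SGC applies after propagation is also left out.

```python
import math


def kmers(seq, k=3):
    """Split an RNA sequence into overlapping k-mers (stride 1).

    The sequence is upper-cased and T is mapped to U so that DNA-style
    input is treated as RNA (an assumption of this sketch).
    """
    seq = seq.upper().replace("T", "U")
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def sgc_propagate(features, edges, num_hops=2):
    """Return S^K X for an undirected graph, with S = D^-1/2 (A + I) D^-1/2.

    features: list of per-node feature lists (e.g. pre-trained embeddings)
    edges:    list of (u, v) node index pairs (e.g. lncRNA-miRNA associations)
    num_hops: K, the number of propagation steps (hypothetical default)
    """
    n = len(features)
    # Neighbour sets including a self-loop for every node (A + I).
    neighbors = [{i} for i in range(n)]
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    degree = [len(nb) for nb in neighbors]

    x = [list(row) for row in features]
    dim = len(x[0])
    for _ in range(num_hops):
        new_x = []
        for i in range(n):
            row = [0.0] * dim
            for j in neighbors[i]:
                # Symmetric normalisation weight for edge (i, j).
                w = 1.0 / math.sqrt(degree[i] * degree[j])
                for d in range(dim):
                    row[d] += w * x[j][d]
            new_x.append(row)
        x = new_x
    return x


# Toy usage: node 0 stands for a lncRNA, node 1 for a miRNA, joined by one edge.
tokens = kmers("ACGUA", k=3)
smoothed = sgc_propagate([[1.0], [0.0]], [(0, 1)], num_hops=1)
```

In a full pipeline, the k-mer tokens would be fed to a document embedding model to obtain the node features, and a linear classifier would be trained on the propagated features; both of those steps are outside this sketch.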

First Page

i

Last Page

37

Publication Date

6-2023

Comments

Thesis submitted to the Deanship of Graduate and Postdoctoral Studies

In partial fulfillment of the requirements for the M.Sc degree in Computer Vision

Advisors: Dr. Shangsong Liang, Dr. Huan Xiong

Online access for MBZUAI patrons
