Sequence Pre-training-based Graph Neural Network for Predicting LncRNA-miRNA Associations

MicroRNAs (miRNAs) silence genes by binding to messenger RNAs (mRNAs), while long non-coding RNAs (lncRNAs) act as competitive endogenous RNAs (ceRNAs) that can relieve miRNA silencing effects and upregulate target gene expression. The ceRNA association between lncRNAs and miRNAs has been a research hotspot due to its medical importance, but it is challenging to verify experimentally. In this paper, we propose a novel deep learning scheme, the Sequence Pre-training-based Graph Neural Network (SPGNN), which combines pre-training and fine-tuning stages to predict lncRNA-miRNA associations from RNA sequence and graph data. In the pre-training stage, we use a sequence-to-vector technique to generate pre-trained embeddings from the sequences of all RNAs. In the fine-tuning stage, we use a Graph Neural Network to learn node representations from the heterogeneous graph constructed from lncRNA-miRNA association information. We evaluate SPGNN on our newly collected animal lncRNA-miRNA association dataset and demonstrate that combining the k-mers technique and the Doc2vec model for pre-training with the Simple Graph Convolution (SGC) network for fine-tuning is effective in predicting lncRNA-miRNA associations. Our approach outperforms state-of-the-art baselines on various evaluation metrics. We also conduct an ablation study and hyperparameter analysis to verify the effectiveness of each component and parameter of our scheme.
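The two building blocks named in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the k-mer length `k=3`, the toy five-node graph, and the random feature matrix standing in for Doc2vec embeddings are all assumptions chosen for illustration, and the function names `kmers` and `sgc_propagate` are hypothetical. The propagation step follows the standard SGC formulation, which multiplies node features by the K-th power of the symmetrically normalized adjacency matrix with self-loops.

```python
import numpy as np


def kmers(seq, k=3):
    """Split an RNA sequence into overlapping k-mers (stride 1).

    The resulting token list is the kind of "document" a Doc2vec-style
    model could be trained on; k=3 is an illustrative choice.
    """
    seq = seq.upper()
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def sgc_propagate(adj, features, hops=2):
    """Return S^hops @ features, where S = D^-1/2 (A + I) D^-1/2.

    adj      : (n, n) symmetric 0/1 adjacency matrix
    features : (n, d) node feature matrix (e.g. pre-trained embeddings)
    hops     : number of propagation steps K
    """
    a_tilde = adj + np.eye(adj.shape[0])      # add self-loops
    deg = a_tilde.sum(axis=1)                 # degrees, always >= 1
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    s = a_tilde * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    out = features
    for _ in range(hops):
        out = s @ out
    return out


# Toy bipartite graph (assumed): nodes 0-1 are lncRNAs, nodes 2-4 are miRNAs.
adj = np.zeros((5, 5))
for lnc, mi in [(0, 2), (0, 3), (1, 3), (1, 4)]:
    adj[lnc, mi] = adj[mi, lnc] = 1.0

# Random vectors stand in for Doc2vec embeddings of each RNA sequence.
rng = np.random.default_rng(0)
features = rng.normal(size=(5, 8))

tokens = kmers("AUGGCU", k=3)
smoothed = sgc_propagate(adj, features, hops=2)
```

In SGC the propagated features are then passed to a simple linear model, so the graph step has no trainable weights of its own; how SPGNN scores lncRNA-miRNA pairs from these representations is described in the thesis itself and is not reproduced here.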

Thesis submitted to the Deanship of Graduate and Postdoctoral Studies

In partial fulfillment of the requirements for the M.Sc degree in Computer Vision

Advisors: Dr. Shangsong Liang, Dr. Huan Xiong

Online access for MBZUAI patrons