Sequence pre-training-based graph neural network for predicting lncRNA-miRNA associations
Briefings in Bioinformatics
MicroRNAs (miRNAs) silence genes by binding to messenger RNAs, whereas long non-coding RNAs (lncRNAs) act as competitive endogenous RNAs (ceRNAs) that can relieve miRNA silencing effects and upregulate target gene expression. The ceRNA association between lncRNAs and miRNAs has been a research hotspot due to its medical importance, but it is challenging to verify experimentally. In this paper, we propose a novel deep learning scheme, i.e. sequence pre-training-based graph neural network (SPGNN), that combines pre-training and fine-tuning stages to predict lncRNA-miRNA associations from RNA sequences and the existing interactions represented as a graph. First, we utilize a sequence-to-vector technique to generate pre-trained embeddings based on the sequences of all RNAs during the pre-training stage. In the fine-tuning stage, we use Graph Neural Network to learn node representations from the heterogeneous graph constructed using lncRNA-miRNA association information. We evaluate our proposed scheme SPGNN on our newly collected animal lncRNA-miRNA association dataset and demonstrate that combining the $k$-mer technique and Doc2vec model for pre-training with the Simple Graph Convolution Network for fine-tuning is effective in predicting lncRNA-miRNA associations. Our approach outperforms state-of-the-art baselines across various evaluation metrics. We also conduct an ablation study and hyperparameter analysis to verify the effectiveness of each component and parameter of our scheme. The complete code and dataset are available on GitHub: https://github.com/zixwang/SPGNN.
ceRNA, graph neural network, lncRNA, miRNA, pre-train
Z. Wang, S. Liang, S. Liu, Z. Meng, J. Wang, and S. Liang, "Sequence pre-training-based graph neural network for predicting lncRNA-miRNA associations", in Briefings in Bioinformatics, vol 24 (5), Sept 2023, doi:10.1093/bib/bbad317