Heterogeneous Graph Contrastive Learning With Metapath-Based Augmentations

Document Type

Article

Publication Title

IEEE Transactions on Emerging Topics in Computational Intelligence

Abstract

Heterogeneous graph contrastive learning is an effective method for learning discriminative node representations in heterogeneous graphs when labels are absent. To utilize metapaths in the contrastive learning process, previous methods construct multiple metapath-based graphs from the original graph, then perform data augmentation and contrastive learning on each graph separately. However, this paradigm suffers from three defects: 1) It does not consider an augmentation scheme over the whole set of metapath-based graphs, which prevents it from fully leveraging the information in those graphs to achieve better performance. 2) The final node embeddings are not optimized by the contrastive objective directly, so they are not guaranteed to be distinctive enough, which leads to suboptimal performance on downstream tasks. 3) The computational complexity of its contrastive objective is high. To tackle these defects, we propose a Heterogeneous Graph Contrastive learning model with Metapath-based Augmentations (HGCMA), which is designed for downstream tasks with a small amount of labeled data. To address the first defect, HGCMA uses both semantic-level and node-level augmentation schemes, in which a metapath-based graph and a certain ratio of edges in each metapath-based graph are randomly masked, respectively. To address the second and third defects, we use a two-stage attention aggregation graph encoder to output the final node embeddings and optimize them directly with the contrastive objective. Extensive experiments on three public datasets validate the effectiveness of HGCMA compared with state-of-the-art methods.
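The two augmentation schemes described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `augment_metapath_graphs`, the edge-list representation, the `edge_mask_ratio` parameter, and the choice to drop exactly one metapath-based graph per view are all assumptions made for illustration.

```python
import random


def augment_metapath_graphs(graphs, edge_mask_ratio=0.2, seed=None):
    """Build one augmented view from a set of metapath-based graphs.

    graphs: dict mapping a metapath name to a list of (u, v) edges.
    Returns (view, dropped), where view is the augmented dict and
    dropped is the name of the masked metapath-based graph (or None).
    All names and the data layout here are hypothetical.
    """
    rng = random.Random(seed)
    names = sorted(graphs)

    # Semantic-level augmentation (assumed form): mask one whole
    # metapath-based graph, but never the only one.
    dropped = rng.choice(names) if len(names) > 1 else None

    view = {}
    for name in names:
        if name == dropped:
            continue
        edges = list(graphs[name])
        # Node-level augmentation (assumed form): mask a fixed ratio
        # of the edges in each remaining metapath-based graph.
        n_keep = len(edges) - int(edge_mask_ratio * len(edges))
        view[name] = sorted(rng.sample(edges, n_keep))
    return view, dropped
```

Calling the function twice with different seeds would give two views of the same graph set, which is the kind of pair a contrastive objective compares. How HGCMA actually samples and pairs its views is not specified in the abstract.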

First Page

1003

Last Page

1014

DOI

10.1109/TETCI.2023.3322341

Publication Date

10-25-2023

Keywords

Task analysis, Training, Semantics, Representation learning, Mutual information, Data augmentation, Computational intelligence

Comments

IR conditions: non-described
