Task-Oriented High-Order Context Graph Networks for Few-Shot Human-Object Interaction Recognition

Document Type

Article

Publication Title

IEEE Transactions on Systems, Man, and Cybernetics: Systems

Abstract

Few-shot human-object interaction (FS-HOI) recognition aims to infer new interactions between human actions and surrounding objects from only a few available instances. It helps alleviate the long-tail and combinatorial explosion problems in human-object interaction (HOI). Nevertheless, existing FS-HOI methods focus only on modeling the relationships between labeled and unlabeled samples in the Euclidean domain, neglecting the rich relational structure of the visual information among labeled samples and between human actions and objects. Accordingly, we tackle the few-shot HOI task in the non-Euclidean domain and present a graph-based model, namely, the task-oriented high-order context graph network (THCG-Net). It contains a task attention module (TA-Module) and a high-order context graph module (HG-Module). In the TA-Module, an attention mechanism uses task information to build a task-oriented space; embedding the visual features into this space captures the information that is discriminative for the current task (episode). The HG-Module constructs a task-level graph and takes the context information as high-order knowledge, which provides discriminative guidance for propagating visual information. It captures the discriminability among different categories while adaptively highlighting the commonality of related categories, which effectively transfers knowledge to related categories. Extensive experimental results on two benchmark datasets, HICO-FS and TUHOI-FS, are provided. They demonstrate that THCG-Net significantly outperforms state-of-the-art approaches, confirming its effectiveness in recognizing various human actions and surrounding objects in few-shot scenarios.
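The abstract describes two steps per episode: re-weighting visual features with task-derived attention, then propagating them over a task-level graph. The sketch below illustrates that general idea only. It is not the THCG-Net formulation: the attention form (a softmax over the mean support feature), the distance-based edge weights, the function names, and the toy feature values are all assumptions made for illustration, and the high-order context knowledge used by the HG-Module is not modeled.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def task_attention(support):
    """Channel weights from the episode's mean support feature (assumed form)."""
    dim = len(support[0])
    mean = [sum(f[d] for f in support) / len(support) for d in range(dim)]
    return softmax(mean)


def embed(features, weights):
    """Re-weight each channel to obtain task-oriented features."""
    return [[w * x for w, x in zip(weights, f)] for f in features]


def propagate(nodes):
    """One message-passing step over a fully connected task-level graph."""
    dim = len(nodes[0])
    out = []
    for a in nodes:
        # Edge scores: negative squared distance, so similar nodes weigh more.
        scores = [-sum((x - y) ** 2 for x, y in zip(a, b)) for b in nodes]
        edges = softmax(scores)
        out.append([sum(e * b[d] for e, b in zip(edges, nodes)) for d in range(dim)])
    return out


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


# Toy 2-way 1-shot episode with one query; all values are made up.
support = [[1.0, 0.0, 0.2], [0.0, 1.0, 0.2]]
query = [[0.9, 0.1, 0.2]]

weights = task_attention(support)
nodes = propagate(embed(support + query, weights))

# Assign the query to the class of the closest support node after propagation.
predicted = min(range(len(support)), key=lambda i: sq_dist(nodes[-1], nodes[i]))
```

In this toy episode the attention weights depend only on the support set, so they change from episode to episode, which is the sense in which the embedding space is task-oriented here. A learned model would replace both the attention and the edge scores with trainable functions.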

First Page

5443

Last Page

5455

DOI

10.1109/TSMC.2021.3125343

Publication Date

11-12-2021

Keywords

Few-shot learning, graph neural networks (GNNs), human-object interaction (HOI), meta-learning

Comments

IR Deposit conditions:

OA version (pathway a) Accepted version

No embargo

When accepted for publication, set statement to accompany deposit (see policy)

Must link to publisher version with DOI

Publisher copyright and source must be acknowledged
