Computer Vision Faculty Publications

Simple Primitives With Feasibility- and Contextuality-Dependence for Open-World Compositional Zero-Shot Learning

Zhe Liu, UNSW Sydney
Yun Li, Commonwealth Scientific and Industrial Research Organisation
Lina Yao, UNSW Sydney
Xiaojun Chang, University of Technology Sydney
Wei Fang, Jiangnan University
Xiaojun Wu, Jiangnan University
Abdulmotaleb El Saddik, Mohamed Bin Zayed University of Artificial Intelligence

Document Type

Article

Publication Title

IEEE Transactions on Pattern Analysis and Machine Intelligence

Abstract

The task of Open-World Compositional Zero-Shot Learning (OW-CZSL) is to recognize novel state-object compositions in images from among all possible compositions, where the novel compositions are absent during training. The performance of conventional methods degrades significantly because of the large cardinality of possible compositions. Some recent works treat simple primitives (i.e., states and objects) as independent and predict them separately to reduce cardinality. However, this ignores the heavy dependence between states, objects, and compositions. In this paper, we model the dependence via feasibility and contextuality. Feasibility-dependence refers to the unequal feasibility of compositions, e.g., hairy is more feasible with cat than with building in the real world. Contextuality-dependence represents the contextual variance in images, e.g., a cat shows diverse appearances when it is dry or wet. We design Semantic Attention (SA) to capture the feasibility semantics and alleviate impossible predictions, driven by the visual similarity between simple primitives. We also propose a generative Knowledge Disentanglement (KD) to disentangle images into unbiased representations, easing the contextual bias. Moreover, we complement the independent compositional probability model with the learned feasibility and contextuality in a compatible manner. In the experiments, we demonstrate the superior or competitive performance of our method, SA-and-KD-guided Simple Primitives (SAD-SP), on three benchmark datasets.
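The independent compositional probability model mentioned in the abstract can be illustrated with a minimal sketch. It assumes that a composition score is the product of separately predicted state and object probabilities, optionally reweighted by a per-pair feasibility score. The function name `score_compositions`, the probabilities, and the feasibility values are all hypothetical and hand-set for illustration; this is not the paper's Semantic Attention or Knowledge Disentanglement, only the general idea of letting feasibility suppress unlikely pairs.

```python
from itertools import product


def score_compositions(state_probs, object_probs, feasibility=None):
    """Score every state-object pair in the open-world search space.

    Assumption: score(s, o) = P(s) * P(o) * feasibility(s, o), where the
    feasibility weight defaults to 1.0 (the purely independent model).
    """
    scores = {}
    for state, obj in product(state_probs, object_probs):
        weight = 1.0 if feasibility is None else feasibility.get((state, obj), 1.0)
        scores[(state, obj)] = state_probs[state] * object_probs[obj] * weight
    return scores


# Hypothetical primitive predictions for one image (two small classifiers
# with |S| + |O| outputs instead of one with |S| * |O| outputs).
state_probs = {"hairy": 0.6, "wet": 0.4}
object_probs = {"cat": 0.45, "building": 0.55}

# Hypothetical feasibility scores in [0, 1]; "hairy building" is implausible.
feasibility = {
    ("hairy", "cat"): 0.9,
    ("hairy", "building"): 0.1,
    ("wet", "cat"): 0.8,
    ("wet", "building"): 0.7,
}

independent = score_compositions(state_probs, object_probs)
guided = score_compositions(state_probs, object_probs, feasibility)

best_independent = max(independent, key=independent.get)
best_guided = max(guided, key=guided.get)

print(best_independent)  # ('hairy', 'building')
print(best_guided)       # ('hairy', 'cat')
```

With these made-up numbers, the purely independent model selects the implausible pair "hairy building", while the feasibility-weighted scores select "hairy cat", which is the kind of impossible prediction the abstract says feasibility-dependence is meant to alleviate.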

First Page

543

Last Page

560

DOI

10.1109/TPAMI.2023.3323012

Publication Date

1-1-2024

Keywords

Attention network, compositional zero-shot learning, generative network, knowledge disentanglement, open world

Comments

IR Deposit conditions:

OA version (pathway a) Accepted version

No embargo

When accepted for publication, set statement to accompany deposit (see policy)

Must link to publisher version with DOI

Publisher copyright and source must be acknowledged

Recommended Citation

Z. Liu et al., "Simple Primitives With Feasibility- and Contextuality-Dependence for Open-World Compositional Zero-Shot Learning," in IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 46, no. 1, pp. 543-560, Jan. 2024, doi: 10.1109/TPAMI.2023.3323012.

Additional Links

https://doi.org/10.1109/TPAMI.2023.3323012
