Machine Learning Faculty Publications

VMAN: A Virtual Mainstay Alignment Network for Transductive Zero-Shot Learning

Guo Sen Xie, Nanjing University of Science and Technology & Mohamed bin Zayed University of Artificial Intelligence
Xu Yao Zhang, Institute of Automation Chinese Academy of Sciences
Yazhou Yao, Nanjing University of Science and Technology
Zheng Zhang, Harbin Institute of Technology
Fang Zhao, Inception Institute of Artificial Intelligence
Ling Shao, Inception Institute of Artificial Intelligence

Document Type

Article

Publication Title

IEEE Transactions on Image Processing

Abstract

Transductive zero-shot learning (TZSL) extends conventional ZSL by leveraging (unlabeled) unseen images for model training. A typical method for ZSL involves learning embedding weights from the feature space to the semantic space. However, the learned weights in most existing methods are dominated by seen images, and thus cannot adapt well to unseen images. In this paper, to align the (embedding) weights for better knowledge transfer between seen/unseen classes, we propose the virtual mainstay alignment network (VMAN), which is tailored for the transductive ZSL task. Specifically, VMAN is cast as a tied encoder-decoder network, so only one set of linear mapping weights needs to be learned. To explicitly learn the weights in VMAN, for the first time in ZSL, we propose to generate virtual mainstay (VM) samples for each seen class, which serve as new training data and can, to some extent, prevent the weights from being shifted toward seen images. Moreover, a weighted reconstruction scheme is proposed and incorporated into the model training phase, in both the semantic and feature spaces. In this way, the manifold relationships of the VM samples are well preserved. To further align the weights to adapt to more unseen images, a novel instance-category matching regularization is proposed for model re-training. VMAN is thus modeled as a nested minimization problem and is solved by a Taylor approximate optimization paradigm. In comprehensive evaluations on four benchmark datasets, VMAN achieves superior performance under the (Generalized) TZSL setting.

First Page

4316

Last Page

4329

DOI

10.1109/TIP.2021.3070231

Publication Date

4-9-2021

Keywords

Transductive, virtual sample generation, zero-shot learning

Comments

IR Deposit conditions:

OA version (pathway a) Accepted version

No embargo

When accepted for publication, set statement to accompany deposit (see policy)

Must link to publisher version with DOI

Publisher copyright and source must be acknowledged

Recommended Citation

G.-S. Xie, X.-Y. Zhang, Y. Yao, Z. Zhang, F. Zhao and L. Shao, "VMAN: A Virtual Mainstay Alignment Network for Transductive Zero-Shot Learning," in IEEE Transactions on Image Processing, vol. 30, pp. 4316-4329, 2021, doi: 10.1109/TIP.2021.3070231.

Additional Links

IEEE Link: https://doi.org/10.1109/TIP.2021.3070231

