Computer Vision Faculty Publications

Multi-grained Temporal Prototype Learning for Few-shot Video Object Segmentation

Nian Liu, Mohamed Bin Zayed University of Artificial Intelligence
Kepan Nan, Northwestern Polytechnical University
Wangbo Zhao, National University of Singapore
Yuanwei Liu, Northwestern Polytechnical University
Xiwen Yao, Northwestern Polytechnical University
Salman Khan, Mohamed Bin Zayed University of Artificial Intelligence
Hisham Cholakkal, Mohamed Bin Zayed University of Artificial Intelligence
Rao Muhammad Anwer, Mohamed Bin Zayed University of Artificial Intelligence

Document Type

Conference Proceeding

Publication Title

Proceedings of the IEEE International Conference on Computer Vision

Abstract

Few-Shot Video Object Segmentation (FSVOS) aims to segment objects in a query video that belong to the category defined by a few annotated support images. However, this task has seldom been explored. In this work, we build on IPMT, a state-of-the-art few-shot image segmentation method that combines external support guidance information with adaptive query guidance cues, and propose to leverage multi-grained temporal guidance information to handle the temporal correlation inherent in video data. We decompose the query video information into a clip prototype and a memory prototype for capturing local and long-term internal temporal guidance, respectively. Frame prototypes are further used for each frame independently to handle fine-grained adaptive guidance and enable bidirectional clip-frame prototype communication. To reduce the influence of noisy memory, we select reliable memory frames by leveraging the structural similarity relation among different predicted regions and the support. Furthermore, a new segmentation loss is proposed to enhance the category discriminability of the learned prototypes. Experimental results demonstrate that our proposed video IPMT model significantly outperforms previous models on two benchmark datasets. Code is available at https://github.com/nankepan/VIPMT.

First Page

18816

Last Page

18825

DOI

10.1109/ICCV51070.2023.01729

Publication Date

1-1-2023

Recommended Citation

N. Liu et al., "Multi-grained Temporal Prototype Learning for Few-shot Video Object Segmentation," Proceedings of the IEEE International Conference on Computer Vision, pp. 18816-18825, Jan 2023.

The definitive version is available at https://doi.org/10.1109/ICCV51070.2023.01729

This document is currently not available here.
