Computer Vision Faculty Publications

SipMaskv2: Enhanced Fast Image and Video Instance Segmentation

Jiale Cao, School of Electrical and Information Engineering, Tianjin University, Tianjin, China
Yanwei Pang, School of Electrical and Information Engineering, Tianjin University, Tianjin, China
Rao Anwer, Mohamed bin Zayed University of Artificial Intelligence
Hisham Cholakkal, Mohamed bin Zayed University of Artificial Intelligence
Fahad Shahbaz Khan, Mohamed bin Zayed University of Artificial Intelligence
Ling Shao, Terminus Group, Beijing, China

Document Type

Article

Publication Title

IEEE Transactions on Pattern Analysis and Machine Intelligence

Abstract

We propose a fast single-stage method for both image and video instance segmentation, called SipMask, that preserves instance spatial information by performing multiple sub-region mask predictions. The main module in our method is a lightweight spatial preservation (SP) module that generates a separate set of spatial coefficients for the sub-regions within a bounding box, enabling better delineation of spatially adjacent instances. To better correlate mask prediction with object detection, we further propose a mask alignment weighting loss and a feature alignment scheme. In addition, we identify two issues that impede the performance of single-stage instance segmentation and introduce two modules, a sample selection scheme and an instance refinement module, to address them. Experiments are performed on the image instance segmentation dataset MS COCO and the video instance segmentation dataset YouTube-VIS. On MS COCO, our method achieves state-of-the-art performance. In terms of real-time capabilities, it outperforms YOLACT by 3.0% (mask AP) under similar settings while operating at a comparable speed. On the YouTube-VIS validation set, our method also achieves promising results. The source code is available at . IEEE

DOI

10.1109/TPAMI.2022.3180564

Publication Date

6-8-2022

Comments

IR Deposit conditions:

OA version (pathway a) Accepted version

No embargo

When accepted for publication, set statement to accompany deposit (see policy)

Must link to publisher version with DOI

Publisher copyright and source must be acknowledged

Recommended Citation

J. Cao, Y. Pang, R. M. Anwer, H. Cholakkal, F. S. Khan and L. Shao, "SipMaskv2: Enhanced Fast Image and Video Instance Segmentation," in IEEE Transactions on Pattern Analysis and Machine Intelligence, 2022, doi: 10.1109/TPAMI.2022.3180564.
