Student Publications

XMem++: Towards production level interactive video object segmentation

Maksym Bekuzarov, Mohamed bin Zayed University of Artificial Intelligence

Document Type

Dissertation

Abstract

Despite advances in user-guided video segmentation, consistently extracting complex objects from highly complex scenes remains labor-intensive, especially in industries such as visual effects production. In practical applications, it is not uncommon for the majority of frames in a video sequence to require manual annotation. There is therefore a need for more efficient segmentation methods that handle complex scenes with high consistency while requiring fewer annotated frames. To address this problem, we introduce XMem++, a novel semi-supervised video object segmentation (SSVOS) model that improves existing memory-based models with a new permanent memory module. Most existing methods focus on single-frame annotations, whereas our approach can effectively handle multiple user-selected frames with varying appearances of the same object or region. Our method extracts highly consistent results while keeping the required number of frame annotations low, reducing the time and effort that annotation requires. The work demonstrates state-of-the-art (SOTA) results on a variety of video sequences, including challenging cases such as partial segmentation, multi-object segmentation, and long videos. The proposed method is labor-efficient and produces high-quality, temporally smooth segmentation results from few manual annotations.

Publication Date

6-2023

Comments

Thesis submitted to the Deanship of Graduate and Postdoctoral Studies

In partial fulfillment of the requirements for the M.Sc degree in Computer Vision

Advisors: Dr. Hao Li, Dr. Fahad Khan

Online access for MBZUAI patrons

Recommended Citation

M. Bekuzarov, "XMem++: Towards production level interactive video object segmentation", M.S. Thesis, Computer Vision, MBZUAI, Abu Dhabi, UAE, 2023.
