CADC++: Advanced Consensus-Aware Dynamic Convolution for Co-Salient Object Detection

Document Type

Article

Publication Title

IEEE Transactions on Pattern Analysis and Machine Intelligence

Abstract

When given a group of relevant images for co-salient object detection (Co-SOD), humans first summarize consensus cues from the whole group and then search for co-salient objects in each image. Most previous methods do not consider robustness, scalability, or stability in the summarization stage and adopt a simple fusion strategy to fuse consensus and image features in the searching stage. Our work presents a novel consensus-aware dynamic convolution (CADC) model directly from the “summarize and search” perspective to explicitly and effectively perform Co-SOD. For the summarization stage, we extract robust individual image features by a pooling method and integrate them to generate consensus features via self-attention, thus modeling the scalability and stability. Then, we simultaneously learn two types of consensus-aware dynamic kernels, i.e., a common kernel to capture group-wise common knowledge and adaptive kernels to mine image-specific consensus cues. For the second stage, we adopt dynamic convolution to perform object searching. A novel data synthesis strategy is also developed for model training. Although CADC has obtained competitive performance, we argue that incrementally learning dynamic kernels and representations is more intuitive and natural instead of using a simultaneous scheme, thus presenting our CADC++, an extension of CADC. Concretely, we first adopt the common kernel based dynamic convolution to capture coarse common cues as priors and then use the adaptive kernel based dynamic convolution for mining image-specific details. We also propose a recursive guidance strategy to further explore deep interactions among the two kinds of kernels and image features. Besides, we annotate several challenging attributes for Co-SOD datasets and perform attribute-based evaluation and robustness analysis to promote thorough model evaluation for the Co-SOD field. Extensive experimental results on four benchmark datasets verify both the effectiveness and robustness of our proposed method.

First Page

2741

Last Page

2757

DOI

10.1109/TPAMI.2023.3336015

Publication Date

5-1-2024

Keywords

Co-salient object detection, dynamic convolution, saliency detection

Comments

IR Deposit conditions:

OA version (pathway a) Accepted version

No embargo

When accepted for publication, set statement to accompany deposit (see policy)

Must link to publisher version with DOI

Publisher copyright and source must be acknowledged

Share

COinS