Category-Contextual Relation Encoding Network for Few-Shot Object Detection

Document Type

Article

Publication Title

IEEE Transactions on Circuits and Systems for Video Technology

Abstract

Few-shot object detection (FSOD) has attracted increasing academic interest because it recognizes previously unseen novel classes from very few well-labeled samples. However, most existing methods identify novel classes via object-specific characteristics in the few provided samples rather than intrinsic inter-class relations between base and novel classes, which heavily degrades detection performance on novel classes. Moreover, they cannot learn discriminative proposal representations to distinguish base and novel classes, and thus misclassify novel objects as confusable base classes. To tackle these challenges, we develop a novel Category-contextual Relation Encoding Network (CRE-Net), which is an early attempt to reason about inter-class contextual relationships for the FSOD task. Specifically, we propose a novel category-contextual relation encoding mechanism that captures intrinsic inter-class relations between base and novel classes via knowledge aggregation from global category-contextual descriptors. It uses these inter-class contextual relations to adaptively refine the convolution kernel, thus encoding the local semantic context of the query image with the category-contextual relation as guidance. Furthermore, to explore discriminative representations for base and novel classes, we develop a scarcity-compensatory contrastive proposal loss that incorporates the data scarcity of novel classes and the semantic consistency of high-confidence proposals. This loss compacts object instances of the same category into a tighter cluster and enhances the separability of different classes in the feature space. Extensive experiments on the Pascal VOC and COCO datasets verify the state-of-the-art detection performance of our CRE-Net model compared with other baseline methods.
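
The abstract does not give the formula for the scarcity-compensatory contrastive proposal loss, so the following is only a minimal sketch of the general idea, not the CRE-Net loss itself. It assumes a standard supervised contrastive loss over proposal embeddings, stands in for "scarcity compensation" with a per-class weight inversely proportional to class frequency, and stands in for "high confidence" with a simple threshold on a per-proposal confidence score. The function name, the `temperature` and `min_conf` parameters, and the weighting scheme are all hypothetical.

```python
import math
from collections import Counter


def l2_normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def scarcity_weighted_contrastive_loss(embeddings, labels, confidences,
                                       temperature=0.2, min_conf=0.7):
    """Supervised contrastive loss over proposals (illustrative only).

    Assumptions, none of which come from the abstract:
      * proposals with confidence < min_conf are ignored;
      * each anchor is weighted by 1 / (number of kept proposals of its class),
        so rare (e.g. novel) classes are not drowned out by frequent ones.
    """
    keep = [i for i, c in enumerate(confidences) if c >= min_conf]
    feats = [l2_normalize(embeddings[i]) for i in keep]
    labs = [labels[i] for i in keep]
    counts = Counter(labs)

    total, weight_sum = 0.0, 0.0
    for i, (zi, yi) in enumerate(zip(feats, labs)):
        others = [j for j in range(len(feats)) if j != i]
        positives = [j for j in others if labs[j] == yi]
        if not positives:
            continue  # an anchor with no same-class partner contributes nothing

        # Numerically stable log-sum-exp over similarities to all other proposals.
        logits = {j: dot(zi, feats[j]) / temperature for j in others}
        m = max(logits.values())
        log_denom = m + math.log(sum(math.exp(v - m) for v in logits.values()))

        # Average negative log-probability of picking a same-class proposal.
        loss_i = -sum(logits[j] - log_denom for j in positives) / len(positives)

        w = 1.0 / counts[yi]
        total += w * loss_i
        weight_sum += w

    return total / weight_sum if weight_sum > 0 else 0.0
```

Under these assumptions, embeddings that cluster tightly by class yield a lower loss than the same embeddings with labels that cut across the clusters, which matches the behavior the abstract describes: same-category instances are pulled together and different classes are pushed apart.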

DOI

10.1109/TCSVT.2024.3378978

Publication Date

1-1-2024

Keywords

Circuits and systems, Detectors, Encoding, Few-shot learning, inter-class relation encoding, Object detection, Proposals, Semantics, Task analysis

This document is currently not available here.
