Zero-shot object detection: joint recognition and localization of novel concepts
Document Type
Article
Publication Title
International Journal of Computer Vision
Abstract
Zero-shot learning (ZSL) identifies unseen objects for which no training images are available. Conventional ZSL approaches are restricted to a recognition setting where each test image is categorized into one of several unseen object classes. We posit that this setting is ill-suited for real-world applications where unseen objects appear only as a part of a complete scene, warranting both ‘recognition’ and ‘localization’ of the unseen category. To address this limitation, we introduce a new ‘Zero-Shot Detection’ (ZSD) problem setting, which aims at simultaneously recognizing and locating object instances belonging to novel categories, without any training samples. We introduce an integrated solution to the ZSD problem that jointly models the complex interplay between visual and semantic domain information. Ours is an end-to-end trainable deep network for ZSD that effectively overcomes the noise in the unsupervised semantic descriptions. To this end, we utilize the concept of meta-classes to design an original loss function that achieves synergy between max-margin class separation and semantic domain clustering. In order to set a benchmark for ZSD, we propose an experimental protocol for the large-scale ILSVRC dataset that adheres to practical challenges, e.g., rare classes are more likely to be the unseen ones. Furthermore, we present a baseline approach extended from conventional recognition to the ZSD setting. Our extensive experiments show a significant boost in performance (in terms of mAP and Recall) on the imperative yet difficult ZSD problem on ImageNet detection, MSCOCO and FashionZSD datasets.
First Page
2979
Last Page
2999
DOI
10.1007/s11263-020-01355-6
Publication Date
12-1-2020
Keywords
Deep learning, Loss function, Zero-shot learning, Zero-shot object detection
Recommended Citation
S. Rahman, S. H. Khan, and F. Porikli, “Zero-shot object detection: joint recognition and localization of novel concepts,” International Journal of Computer Vision, vol. 128, no. 12, pp. 2979–2999, Dec. 2020, doi: 10.1007/s11263-020-01355-6
Comments
IR deposit conditions: