Fine-grained recognition: Accounting for subtle differences between similar classes

Guolei Sun, ETH Zürich
Hisham Cholakkal, Inception Institute of Artificial Intelligence
Salman Khan, Inception Institute of Artificial Intelligence
Fahad Shahbaz Khan, Inception Institute of Artificial Intelligence
Ling Shao, Inception Institute of Artificial Intelligence

Abstract

The main requisite for fine-grained recognition tasks is to focus on the subtle discriminative details that distinguish the subordinate classes from each other. We note that existing methods address this requirement only implicitly, leaving it to a data-driven pipeline to figure out what makes a subordinate class different from the others. This results in two major limitations: First, the network focuses on the most obvious distinctions between classes and overlooks more subtle inter-class variations. Second, the chance of misclassifying a given sample into any of the negative classes is considered equal, while in fact, confusions generally occur only among the most similar classes. Here, we propose to explicitly force the network to find the subtle differences among closely related classes. In this pursuit, we introduce two key novelties that can be easily plugged into existing end-to-end deep learning pipelines. On one hand, we introduce a "diversification block" which masks the most salient features for an input, forcing the network to use more subtle cues for its correct classification. On the other hand, we introduce a "gradient-boosting" loss function that focuses only on the confusing classes for each sample and therefore moves swiftly along the direction on the loss surface that seeks to resolve these ambiguities. The synergy between these two blocks helps the network learn more effective feature representations. Comprehensive experiments are performed on five challenging datasets. Our approach outperforms existing methods under a similar experimental setting on all five datasets.
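To make the two components concrete, the following PyTorch-style sketch illustrates the general idea rather than the paper's exact formulation: a masking step that suppresses the peak of each class activation map with some probability, and a loss computed only over the ground-truth class and its k most confusing negatives. All names (`diversification_mask`, `gradient_boosting_loss`) and hyperparameters (`p_peak`, `k`) are illustrative assumptions, not values taken from the paper.

```python
import torch
import torch.nn.functional as F


def diversification_mask(cams: torch.Tensor, p_peak: float = 0.5) -> torch.Tensor:
    """Randomly suppress the most salient location of each class activation
    map (B, C, H, W), so the network must rely on subtler cues.
    A sketch; the paper's masking scheme may differ in detail."""
    b, c, h, w = cams.shape
    flat = cams.view(b, c, -1)
    peak_idx = flat.argmax(dim=-1, keepdim=True)                 # (B, C, 1) peak per map
    drop = (torch.rand(b, c, 1, device=cams.device) < p_peak).float()
    mask = torch.ones_like(flat)
    mask.scatter_(-1, peak_idx, 1.0 - drop)                      # zero out selected peaks
    return (flat * mask).view(b, c, h, w)


def gradient_boosting_loss(logits: torch.Tensor, target: torch.Tensor,
                           k: int = 10) -> torch.Tensor:
    """Cross-entropy restricted to the ground-truth class and the k highest-
    scoring negative classes, so gradients concentrate on resolving
    ambiguities among the most similar (confusing) classes."""
    neg_logits = logits.clone()
    neg_logits.scatter_(1, target.unsqueeze(1), float('-inf'))   # hide the GT class
    topk_neg = neg_logits.topk(k, dim=1).values                  # (B, k) confusing negatives
    gt_logit = logits.gather(1, target.unsqueeze(1))             # (B, 1) GT logit
    restricted = torch.cat([gt_logit, topk_neg], dim=1)          # GT placed at index 0
    new_target = torch.zeros(logits.size(0), dtype=torch.long, device=logits.device)
    return F.cross_entropy(restricted, new_target)
```

In a training step, the mask would be applied to the category-wise activation maps before global pooling, and the pooled logits would then be fed to the restricted loss; the exact wiring into a backbone is left out of this sketch.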