Self-supervised Knowledge Distillation for Few-shot Learning
Document Type
Conference Proceeding
Publication Title
32nd British Machine Vision Conference, BMVC 2021
Abstract
The real world contains an overwhelmingly large number of object classes, learning all of which at once is infeasible. Few-shot learning provides a promising learning paradigm due to its ability to quickly adapt to novel distributions with only a few samples. Recent works [8, 39] show that simply learning a good feature embedding can outperform more sophisticated meta-learning and metric learning algorithms for few-shot learning. This paper proposes a self-supervised knowledge distillation approach, which learns a strong equivariant feature embedding for few-shot learning by faithfully encoding inter-class relationships and preserving intra-class diversity. To this end, we follow a two-stage learning process: first, we train our model using a self-supervised auxiliary loss to maximize the entropy of the feature embedding, thus creating an optimal output manifold. In the second stage, we minimize the entropy of the feature embedding by bringing self-supervised positive twins together, while constraining the learned manifold with student-teacher distillation. Our experiments show that, even in the first stage, the features learnt with self-supervision can outperform current state-of-the-art methods, with further gains achieved by our second-stage distillation process.
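To make the two-stage objective in the abstract concrete, below is a minimal PyTorch-style sketch of what the stage losses could look like. It is an illustration under stated assumptions, not the authors' implementation: the function names (stage1_loss, stage2_loss), the weights alpha and beta, the temperature T, the choice of 4-way rotation prediction as the self-supervised auxiliary task, and the L2 twin term are all assumptions made for this sketch.

```python
# Hypothetical sketch of the two training stages described in the abstract.
# All names and hyperparameters here are illustrative assumptions.
import torch
import torch.nn.functional as F


def stage1_loss(logits, rot_logits, labels, rot_labels, alpha=1.0):
    """Stage 1: supervised cross-entropy plus a self-supervised auxiliary
    loss (assumed here to be 4-way rotation prediction) that spreads the
    feature embedding over the output manifold (entropy maximization)."""
    ce = F.cross_entropy(logits, labels)           # class supervision
    ssl = F.cross_entropy(rot_logits, rot_labels)  # rotation prediction
    return ce + alpha * ssl


def stage2_loss(student_logits, teacher_logits, feat, feat_twin,
                labels, beta=0.1, T=4.0):
    """Stage 2: knowledge distillation from the frozen stage-1 teacher
    constrains the learned manifold, while an L2 term pulls an image's
    embedding towards its self-supervised positive twin (e.g. a rotated
    copy), reducing the entropy of the feature embedding."""
    ce = F.cross_entropy(student_logits, labels)
    # Temperature-scaled KL divergence between student and teacher outputs
    kd = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                  F.softmax(teacher_logits / T, dim=1),
                  reduction="batchmean") * (T * T)
    twin = F.mse_loss(feat, feat_twin)             # bring twins together
    return ce + kd + beta * twin
```

In this reading, stage 1 maximizes embedding entropy through the auxiliary task, and stage 2 minimizes it via the twin term while the distillation term keeps the student close to the stage-1 manifold.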
Publication Date
1-1-2021
Recommended Citation
J. Rajasegaran et al., "Self-supervised Knowledge Distillation for Few-shot Learning," in 32nd British Machine Vision Conference, BMVC 2021, Jan. 2021.