Self-Distillation for Robust LiDAR Semantic Segmentation in Autonomous Driving
Lecture Notes in Computer Science
We propose a new and effective self-distillation framework with our new Test-Time Augmentation (TTA) and Transformer-based Voxel Feature Encoder (TransVFE) for robust LiDAR semantic segmentation in autonomous driving, where robustness is mission-critical but usually neglected. The proposed framework distills knowledge from a teacher model instance to a student model instance, while the two instances share the same network architecture and learn and evolve jointly. This requires the teacher model to remain strong as training evolves. Our TTA strategy effectively reduces the uncertainty in the teacher model's inference stage. Thus, we equip the teacher model with TTA to provide privileged guidance, while the student continuously updates the teacher with the better network parameters it has learned. To further strengthen the teacher, we propose TransVFE, which improves point cloud encoding by modeling and preserving the local relationships among the points inside each voxel via multi-head attention. The proposed modules are designed generically so that they can be instantiated with different backbones. Evaluations on the SemanticKITTI and nuScenes datasets show that our method achieves state-of-the-art performance. Our code is publicly available at https://github.com/jialeli1/lidarseg3d.
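The teacher-side TTA (averaging predictions over augmented views) and the student-to-teacher parameter transfer described above can be sketched as follows. This is a minimal, framework-agnostic illustration, not the authors' implementation: the augmentation set, the identity `model`, and the momentum-style `update_teacher` rule are all assumptions made for the sketch.

```python
def tta_predict(model, points, augmentations):
    """Test-Time Augmentation: average the teacher's per-point class
    logits over several augmented views of the same point cloud.
    `model` is any callable mapping points -> per-point logits;
    `augmentations` is a hypothetical list of view transforms."""
    votes = [model(aug(points)) for aug in augmentations]
    n_views = len(votes)
    n_points = len(votes[0])
    n_classes = len(votes[0][0])
    return [[sum(v[i][c] for v in votes) / n_views
             for c in range(n_classes)]
            for i in range(n_points)]


def update_teacher(teacher_params, student_params, momentum=0.99):
    """Transfer the student's learned weights into the teacher.
    A momentum (EMA-style) update is one plausible instantiation of
    'the student continuously updates the teacher'; the paper may use
    a different rule."""
    return [momentum * t + (1.0 - momentum) * s
            for t, s in zip(teacher_params, student_params)]
```

With an identity model, averaging a view and its sign-flipped counterpart cancels the logits, illustrating how TTA smooths out view-dependent noise before the teacher's predictions are used to supervise the student.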
LiDAR, Self-distillation, Semantic segmentation
J. Li, H. Dai, and Y. Ding, "Self-Distillation for Robust LiDAR Semantic Segmentation in Autonomous Driving," in Computer Vision (ECCV 2022), Lecture Notes in Computer Science, vol. 13688, Oct. 2022, pp. 659-676, doi: 10.1007/978-3-031-19815-1_38