Document Type

Conference Proceeding

Publication Title

Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition


Although deep neural networks (DNNs) achieve high predictive accuracy across many challenging computer vision problems, recent studies suggest that they tend to make over-confident predictions, rendering them poorly calibrated. Most existing attempts to improve DNN calibration are limited to classification tasks and restricted to calibrating in-domain predictions. Surprisingly, few, if any, attempts have been made to study the calibration of object detection methods, which occupy a pivotal space in vision-based security-sensitive and safety-critical applications. In this paper, we propose a new train-time technique for calibrating modern object detection methods. It is capable of jointly calibrating multiclass confidence and box localization by leveraging their predictive uncertainties. We perform extensive experiments on several in-domain and out-of-domain detection benchmarks. Results demonstrate that our proposed train-time calibration method consistently outperforms several baselines in reducing calibration error for both in-domain and out-of-domain predictions. Our code and models are available at

Keywords

Vision applications and systems, Location awareness, Computer vision, Uncertainty, Object detection, Detectors, Artificial neural networks, Rendering (computer graphics)


Open Access version available on CVF

Archived, thanks to CVF

Uploaded, May 13, 2024