BEVRefiner: Improving 3D Object Detection in Bird’s-Eye-View via Dual Refinement

Document Type

Article

Publication Title

IEEE Transactions on Intelligent Transportation Systems

Abstract

Many multi-view camera-based 3D object detection models transform image features into the Bird's-Eye-View (BEV) via the Lift-Splat-Shoot (LSS) mechanism, which "lifts" 2D camera-view features into 3D voxel space based on a predicted depth distribution and then "splats" the 3D features onto a BEV plane for subsequent 3D object detection. However, the BEV feature in such a one-stage view transformation scheme relies heavily on the quality of the predicted depth distribution and the 2D camera-view features, which in turn determines the final detection performance. In this paper, we propose BEVRefiner, a model that performs dual refinement of both the depth prediction and the 2D camera-view features. On the one hand, we perform lightweight depth refinement in the depth distribution frustum space by incorporating 3D context and a depth distribution prior. On the other hand, we reproject the BEV feature back to each camera view to enhance the 2D image features. In this way, the original camera-view features are enhanced by implicitly incorporating 3D and multi-view contexts, which cannot be achieved in the original 2D camera view. We also propose to use only the dominant depth bins for the reprojection to reduce the computational burden. Finally, we generate a refined BEV feature from the refined depth distribution and camera-view features for more accurate 3D object detection. BEVRefiner can be plugged into LSS-based BEV detectors; extensive experiments on the representative BEVDet model strongly verify the effectiveness of our approach under several settings.
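For readers unfamiliar with the view transformation the abstract builds on, the following is a minimal, illustrative PyTorch sketch of an LSS-style lift-splat step: the predicted per-pixel depth distribution weights the 2D image features into a 3D frustum, which is then pooled onto a BEV grid. The tensor shapes, function names, and the scatter-based pooling are assumptions made for illustration only; they do not reproduce the authors' BEVRefiner implementation.

# Illustrative LSS-style lift-splat sketch (not the authors' code).
import torch
import torch.nn.functional as F

def lift(img_feat, depth_logits):
    # img_feat:     (B, C, H, W)  2D camera-view features
    # depth_logits: (B, D, H, W)  predicted depth-distribution logits over D bins
    depth_prob = F.softmax(depth_logits, dim=1)                 # (B, D, H, W)
    # Outer product: weight every feature channel by every depth bin.
    frustum = depth_prob.unsqueeze(1) * img_feat.unsqueeze(2)   # (B, C, D, H, W)
    return frustum

def splat(frustum, bev_index, bev_hw):
    # bev_index: (B, D, H, W) long tensor with the flat BEV-cell index of each
    #            frustum point, assumed precomputed from camera geometry.
    B, C, D, H, W = frustum.shape
    bev = frustum.new_zeros(B, C, bev_hw[0] * bev_hw[1])
    flat = frustum.reshape(B, C, -1)                            # (B, C, D*H*W)
    idx = bev_index.reshape(B, 1, -1).expand(B, C, -1)          # broadcast over channels
    bev.scatter_add_(2, idx, flat)                              # sum-pool into BEV cells
    return bev.reshape(B, C, bev_hw[0], bev_hw[1])

The scatter-based sum pooling here is one common way to realize the "splat"; the original LSS implementation uses a cumulative-sum trick for the same pooling, and either choice yields a BEV feature map on which a detection head can operate.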

DOI

10.1109/TITS.2024.3394550

Publication Date

1-1-2024

Keywords

3D object detection, BEV, depth prediction, refinement
