Geometry to the rescue: 3D instance reconstruction from a cluttered scene

Lin Li, CSIRO Data61
Salman Khan, The Australian National University
Nick Barnes, The Australian National University


3D object instance reconstruction from a cluttered 2D scene image is an ill-posed problem. The main challenges are the lack of geometric information in color images and heavy occlusions, which lead to incomplete shape details. Existing works on 3D instance reconstruction address this problem by directly learning the mapping between the intensity image and the corresponding 3D volume model. In contrast, we propose to explicitly incorporate 2.5D geometric cues, such as surface normals, relative depth, and height, when generating full 3D shapes from 2D images. With an intermediate step that estimates these 2.5D geometric features, we propose a novel convolutional neural network design that progressively moves from 2D to full 3D estimation. Our model automatically generates instance-specific surface normal maps, relative depth, and height, which are compactly encoded within the network and subsequently used to improve 3D instance reconstruction. Experimental results on the large-scale synthetic SUNCG dataset and the real-world NYU Depth v2 dataset demonstrate the effectiveness of the proposed approach, which outperforms the state-of-the-art Factored3D network [15].