Multi-Scale Articulated Human Reconstruction by Adaptive Optimization on Sparse Images

Date of Award

4-30-2024

Document Type

Thesis

Degree Name

Master of Science in Machine Learning

Department

Machine Learning

First Advisor

Dr. Xiaodan Liang

Second Advisor

Dr. Bin Gu

Abstract

3D Gaussian Splatting has recently emerged as a highly effective technique for novel view synthesis, particularly for reconstructing human models from monocular videos. It surpasses traditional NeRF-based implicit representations in both efficiency and visual fidelity. However, its reliance on a large number of Gaussians introduces challenges: increased memory consumption, longer training times, and noticeable artifacts when the sampling rate changes. To mitigate these issues, we propose an approach that uses the Jensen-Shannon divergence to adjust the Gaussian points dynamically. This keeps the Gaussian count at around 7,000 per person, significantly streamlining optimization without compromising quality. We also employ a refined training strategy that renders images at multiple resolutions and segments the original image into 16 sub-images. This segmentation improves visual fidelity and enables the comprehensive reconstruction of a person from just 20 images. Finally, a 3D filter smooths out the remaining artifacts, further improving the overall quality of the reconstructions.
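As a rough illustration of two ingredients named in the abstract, the sketch below computes the Jensen-Shannon divergence between two discrete distributions and splits an image into 16 sub-images. It is not the thesis implementation. The abstract does not say which distributions are compared or how the image is partitioned, so the histogram inputs, the 4 x 4 grid, and the function names `js_divergence` and `split_into_tiles` are all assumptions made for illustration.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) in nats; terms with p_i = 0 contribute 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence between two non-negative weight vectors.

    Inputs are normalized to sum to 1. The result is symmetric and
    bounded by ln(2) when the natural logarithm is used.
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    sum_p, sum_q = sum(p), sum(q)
    p = [x / sum_p for x in p]
    q = [x / sum_q for x in q]
    # Mixture distribution; it is positive wherever p or q is positive,
    # so the KL terms below never divide by zero.
    m = [0.5 * (a + b) for a, b in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def split_into_tiles(image, rows=4, cols=4):
    """Split a 2D image (list of pixel rows) into rows * cols equal tiles.

    Assumption: a regular grid, 4 x 4 by default, giving 16 sub-images.
    Tiles are returned in row-major order.
    """
    height, width = len(image), len(image[0])
    if height % rows or width % cols:
        raise ValueError("image size must be divisible by the grid size")
    tile_h, tile_w = height // rows, width // cols
    return [
        [row[c * tile_w:(c + 1) * tile_w]
         for row in image[r * tile_h:(r + 1) * tile_h]]
        for r in range(rows)
        for c in range(cols)
    ]


if __name__ == "__main__":
    # Two hypothetical histograms, e.g. of some per-Gaussian statistic.
    print(js_divergence([0.1, 0.4, 0.5], [0.3, 0.3, 0.4]))
    # A toy 8 x 8 "image" split into 16 tiles of 2 x 2 pixels.
    toy = [[r * 8 + c for c in range(8)] for r in range(8)]
    print(len(split_into_tiles(toy)))
```

A divergence of 0 means the two distributions are identical, and the maximum, ln 2, is reached when they do not overlap. A bounded, symmetric score of this kind is convenient for thresholding, which may be one reason to prefer it over the plain KL divergence when deciding which points to keep.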

Comments

Thesis submitted to the Deanship of Graduate and Postdoctoral Studies

In partial fulfilment of the requirements for the M.Sc degree in Machine Learning

Advisors: Xiaodan Liang, Bin Gu

Online access available for MBZUAI patrons
