Multi-Scale Articulated Human Reconstruction by Adaptive Optimization on Sparse Images
Date of Award
4-30-2024
Document Type
Thesis
Degree Name
Master of Science in Machine Learning
Department
Machine Learning
First Advisor
Dr. Xiaodan Liang
Second Advisor
Dr. Bin Gu
Abstract
Recently, 3D Gaussian Splatting has emerged as a highly effective technique for novel view synthesis, particularly for reconstructing human models from monocular videos. It surpasses traditional NeRF-based implicit representations in both efficiency and visual fidelity. However, its reliance on a large number of Gaussians increases memory consumption, lengthens training, and can produce noticeable artifacts when the sampling rate is changed. To mitigate these issues, we propose an approach that uses Jensen-Shannon divergence to dynamically adjust the Gaussian points. This technique keeps the Gaussian count at around 7,000 per person, significantly streamlining optimization without compromising quality. We also employ a refined training strategy that renders images at multiple resolutions and segments the original image into 16 sub-images. This segmentation improves visual fidelity and enables the comprehensive reconstruction of a person from just 20 images. Finally, a 3D filter smooths out the remaining artifacts, further improving the overall quality of the reconstructions.
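As a rough illustration of two ingredients named in the abstract, the sketch below computes the Jensen-Shannon divergence between two discrete probability distributions and splits an image into 16 sub-images. The abstract does not say how the divergence is evaluated on Gaussians or how the sub-images are laid out, so the discrete formulation, the 4 x 4 grid, and the function names `kl_divergence`, `js_divergence`, and `split_into_tiles` are all assumptions made for this example, not the thesis's implementation.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) in nats for discrete distributions given as sequences.

    Terms with p_i == 0 contribute 0. Assumes q_i > 0 wherever p_i > 0.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence: 0.5*KL(p||m) + 0.5*KL(q||m), m = (p+q)/2.

    Symmetric in p and q, and bounded above by log(2) in nats.
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def split_into_tiles(image, grid=4):
    """Split a 2D image (list of equal-length rows) into grid*grid tiles.

    Tiles are returned in row-major order. The 4 x 4 default is an
    assumption; the abstract only states that there are 16 sub-images.
    """
    height, width = len(image), len(image[0])
    if height % grid or width % grid:
        raise ValueError("image size must be divisible by the grid size")
    th, tw = height // grid, width // grid
    return [
        [row[c * tw:(c + 1) * tw] for row in image[r * th:(r + 1) * th]]
        for r in range(grid)
        for c in range(grid)
    ]
```

Under these assumptions, a divergence near zero indicates two nearly identical distributions, which is the kind of signal an adaptive scheme could use when deciding whether points are redundant; each tile returned by `split_into_tiles` could then serve as a separate training target alongside lower-resolution renders of the full image.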
Recommended Citation
Y. Fei, "Multi-Scale Articulated Human Reconstruction by Adaptive Optimization on Sparse Images," Apr. 2024.
Comments
Thesis submitted to the Deanship of Graduate and Postdoctoral Studies
In partial fulfilment of the requirements for the M.Sc degree in Machine Learning
Advisors: Xiaodan Liang, Bin Gu
Online access available for MBZUAI patrons