3D Human Pose Estimation Under Occlusions and Partial Views
3D human pose estimation is a core problem in computer vision that enables machines to interpret human body movement in three-dimensional space. One of the primary obstacles in this field is estimating poses accurately when the body is occluded or only partially visible, and researchers have developed several techniques to address it. The current state-of-the-art model, PARE, uses segmentation as a part attention mechanism for the 3D pose parameters, which specifically addresses occluded body parts. Nevertheless, PARE has certain limitations, which prompted the development of a new architecture called Sequential PARE. This architecture feeds the segmentation as an additional input into the 3D pose parameter branch, which yields a more informative output with fewer empty channels and better segmentation than PARE's output. Despite its promising design, Sequential PARE falls short of the original PARE model on quantitative metrics, although its results are visually similar. To evaluate the proposed architecture against the leading methods in the field, Sequential PARE was also compared with CLIFF, another state-of-the-art model for 3D human pose estimation. In addition, the impact of image augmentation techniques, including random cropping and synthetic occlusion, on the performance of these 3D pose estimation methods was explored. Such augmentations may be important for improving pose estimation models, because they expose the models to a wider range of inputs, which can improve generalization and robustness.
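The two augmentations named in the abstract can be illustrated with a minimal sketch. This is not the thesis implementation: the function names, the rectangular occluder, the constant fill value, and the list-of-rows image representation are all assumptions chosen to keep the example self-contained. Practical pipelines usually work on array images and may paste realistic objects rather than flat rectangles.

```python
import random


def random_crop(image, crop_h, crop_w, rng):
    """Return a crop_h x crop_w window taken at a random position.

    `image` is a list of rows (hypothetical representation for this sketch).
    """
    h, w = len(image), len(image[0])
    if crop_h > h or crop_w > w:
        raise ValueError("crop size exceeds image size")
    top = rng.randint(0, h - crop_h)    # randint is inclusive on both ends
    left = rng.randint(0, w - crop_w)
    return [row[left:left + crop_w] for row in image[top:top + crop_h]]


def synthetic_occlusion(image, occ_h, occ_w, rng, fill=0):
    """Return a copy of `image` with a random rectangle overwritten by `fill`."""
    h, w = len(image), len(image[0])
    occ_h, occ_w = min(occ_h, h), min(occ_w, w)  # clamp occluder to the image
    top = rng.randint(0, h - occ_h)
    left = rng.randint(0, w - occ_w)
    out = [list(row) for row in image]  # copy so the input stays untouched
    for y in range(top, top + occ_h):
        for x in range(left, left + occ_w):
            out[y][x] = fill
    return out


rng = random.Random(0)                   # seeded for repeatability
image = [[1] * 8 for _ in range(6)]      # toy 6 x 8 single-channel image
cropped = random_crop(image, 4, 5, rng)
occluded = synthetic_occlusion(image, 2, 3, rng)
```

Random cropping simulates partial views by removing part of the body from the frame, while the occluder hides a region inside the frame. Applying either one at training time changes only the input image, so the ground-truth pose labels can stay as they are.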
M. Kengeskanov, "3D Human Pose Estimation Under Occlusions and Partial Views", M.S. Thesis, Machine Learning, MBZUAI, Abu Dhabi, UAE, 2023.