PoseGate-Former: Transformer Encoder with Trainable Gate for 3D Human Pose Estimation Using Weakly Supervised Learning

Publisher:
Springer International Publishing
Publication Type:
Chapter
Citation:
Neural Information Processing, 2021, 1517 CCIS, pp. 266-274
Issue Date:
2021-01-01
Weakly supervised learning for 3D human pose estimation can learn a real human structure, but it generally achieves lower accuracy when reconstructing 3D poses. In this work, we present PoseGate-Former, a 3D pose estimation model built on a Transformer encoder architecture with a trainable gate. The model is trained on individual images using a weakly supervised learning approach. Adding the trainable gate to the Transformer encoder can reduce the possibility of overfitting on some action categories. We evaluated the model on two benchmark datasets, Human3.6M and HumanEva-I. The experimental results show that the model can obtain substantially better accuracy in all action categories of 3D human poses in these datasets than some fully supervised 3D pose estimation approaches.