A top-down approach to articulated human pose estimation and tracking

Publication Type:
Conference Proceeding
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2019, 11130 LNCS pp. 227 - 234
Issue Date:
Full metadata record
© 2019, Springer Nature Switzerland AG. Both the tasks of multi-person human pose estimation and pose tracking in videos are quite challenging. Existing methods can be categorized into two groups: top-down and bottom-up approaches. In this paper, following the top-down approach, we aim to build a strong baseline system with three modules: human candidate detector, single-person pose estimator and human pose tracker. Firstly, we choose a generic object detector among state-of-the-art methods to detect human candidates. Then, cascaded pyramid network is used to estimate the corresponding human pose. Finally, we use a flow-based pose tracker to render keypoint-association across frames, i.e., assigning each human candidate a unique and temporally-consistent id, for the multi-target pose tracking purpose. We conduct extensive ablative experiments to validate various choices of models and configurations. We take part in two ECCV’18 PoseTrack challenges (https://posetrack.net/workshops/eccv2018/posetrack_eccv_2018_results.html ): pose estimation and pose tracking.
Please use this identifier to cite or link to this item: