Depth-Based Hand Pose Estimation: Methods, Data, and Challenges

Supančič, JS; Rogez, G; Yang, Y; Shotton, J; Ramanan, D

Publisher: SPRINGER
Publication Type: Journal Article
Citation: International Journal of Computer Vision, 2018, 126, (11), pp. 1180-1198
Issue Date: 2018-11-01

Closed Access

	Filename	Description	Size	Format
	s11263-018-1081-7.pdf	Published version	4.13 MB	Adobe PDF

This item is closed access and not available.

Full metadata record

Field	Value	Language
dc.contributor.author	Supančič, JS
dc.contributor.author	Rogez, G
dc.contributor.author	Yang, Y https://orcid.org/0000-0002-0512-880X
dc.contributor.author	Shotton, J
dc.contributor.author	Ramanan, D
dc.date.accessioned	2022-09-10T23:15:36Z
dc.date.available	2022-09-10T23:15:36Z
dc.date.issued	2018-11-01
dc.identifier.citation	International Journal of Computer Vision, 2018, 126, (11), pp. 1180-1198
dc.identifier.issn	0920-5691
dc.identifier.issn	1573-1405
dc.identifier.uri	http://hdl.handle.net/10453/161629
dc.description.abstract	Hand pose estimation has matured rapidly in recent years. The introduction of commodity depth sensors and a multitude of practical applications have spurred new advances. We provide an extensive analysis of the state of the art, focusing on hand pose estimation from a single depth frame. To do so, we have implemented a considerable number of systems, and have released software and evaluation code. We summarize important conclusions here: (1) Coarse pose estimation appears viable for scenes with isolated hands. However, high-precision pose estimation (required for immersive virtual reality) and cluttered scenes (where hands may be interacting with nearby objects and surfaces) remain a challenge. To spur further progress, we introduce a challenging new dataset with diverse, cluttered scenes. (2) Many methods evaluate themselves with disparate criteria, making comparisons difficult. We define consistent evaluation criteria, rigorously motivated by human experiments. (3) We introduce a simple nearest-neighbor baseline that outperforms most existing systems. This implies that most systems do not generalize beyond their training sets. This also reinforces the under-appreciated point that training data is as important as the model itself. We conclude with directions for future progress.
dc.language	English
dc.publisher	SPRINGER
dc.relation.ispartof	International Journal of Computer Vision
dc.relation.isbasedon	10.1007/s11263-018-1081-7
dc.rights	info:eu-repo/semantics/closedAccess
dc.subject	0801 Artificial Intelligence and Image Processing
dc.subject.classification	Artificial Intelligence & Image Processing
dc.title	Depth-Based Hand Pose Estimation: Methods, Data, and Challenges
dc.type	Journal Article
utslib.citation.volume	126
utslib.for	0801 Artificial Intelligence and Image Processing
pubs.organisational-group	/University of Technology Sydney
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology
utslib.copyright.status	closed_access	*
dc.date.updated	2022-09-10T23:15:32Z
pubs.issue	11
pubs.publication-status	Published
pubs.volume	126
utslib.citation.issue	11

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/161629