Object Learning for 6D Pose Estimation and Grasping from RGB-D Videos of In-hand Manipulation

Patten, T; Park, K; Leitner, M; Wolfram, K; Vincze, M



Publisher: IEEE
Publication Type: Conference Proceeding
Citation: IEEE International Conference on Intelligent Robots and Systems, 2021, 00, pp. 4831-4838
Issue Date: 2021-01-01

Closed Access

Filename	Description	Size	Format
Object_Learning_for_6D_Pose_Estimation_and_Grasping_from_RGB-D_Videos_of_In-hand_Manipulation.pdf	Published version	5.06 MB	Adobe PDF


This item is closed access and not available.

Full metadata record

Field	Value	Language
dc.contributor.author	Patten, T https://orcid.org/0000-0003-1139-9451
dc.contributor.author	Park, K
dc.contributor.author	Leitner, M
dc.contributor.author	Wolfram, K
dc.contributor.author	Vincze, M
dc.date	2021-09-27
dc.date.accessioned	2022-07-04T05:51:05Z
dc.date.available	2022-07-04T05:51:05Z
dc.date.issued	2021-01-01
dc.identifier.citation	IEEE International Conference on Intelligent Robots and Systems, 2021, 00, pp. 4831-4838
dc.identifier.isbn	9781665417143
dc.identifier.issn	2153-0858
dc.identifier.issn	2153-0866
dc.identifier.uri	http://hdl.handle.net/10453/158610
dc.description.abstract	Object models are highly useful for robots as they enable tasks such as detection, pose estimation and manipulation. However, models are not always easily available, especially in real-world domains of operation such as people's homes. This work presents a pipeline to generate high-quality object reconstructions from human in-hand manipulation to alleviate the necessity of specialised or expensive hardware. Missing data, due to occlusion or unseen sides, is explicitly handled by incorporating shape completion. We demonstrate the usability of the reconstructions by applying a model-based as well as a CNN-based object pose estimator that is trained on synthetic images by employing state-of-the-art texture synthesis. Using our pipeline to cheaply generate object models and synthetic RGB images for training, we achieve competitive performance compared to baselines that require an elaborate set-up to construct models or large amounts of annotated data. Object grasping is also enabled by learning with the reconstructions in simulation, then executing with a real robot. These evaluations show that our reconstructions are comparable to those made under near-perfect conditions and enable 6D object pose estimation as well as real-world grasping.
dc.language	en
dc.publisher	IEEE
dc.relation.ispartof	IEEE International Conference on Intelligent Robots and Systems
dc.relation.ispartof	2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
dc.relation.isbasedon	10.1109/IROS51168.2021.9635884
dc.rights	info:eu-repo/semantics/closedAccess
dc.title	Object Learning for 6D Pose Estimation and Grasping from RGB-D Videos of In-hand Manipulation
dc.type	Conference Proceeding
utslib.citation.volume	00
pubs.organisational-group	/University of Technology Sydney
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology/School of Mechanical and Mechatronic Engineering
utslib.copyright.status	closed_access	*
dc.date.updated	2022-07-04T05:51:02Z
pubs.finish-date	2021-10-01
pubs.publication-status	Published
pubs.start-date	2021-09-27
pubs.volume	00


Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/158610