Sim-real joint reinforcement transfer for 3D indoor navigation

Zhu, F; Zhu, L; Yang, Y

Sim-real joint reinforcement transfer for 3D indoor navigation

Zhu, F Zhu, L

Yang, Y

Permalink

Publication Type:: Conference Proceeding
Citation:: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2019, 2019-June pp. 11380 - 11389
Issue Date:: 2019-06-01

Open Access

Copyright Clearance Process

Recently Added
In Progress
Open Access

This item is open access.

Adobe PDF

Download Accepted Manuscript versionAdobe PDF (7.2 MB)

View on publisher's site

View statistics

Full metadata record

Field	Value	Language
dc.contributor.author	Zhu, F	en_US
dc.contributor.author	Zhu, L https://orcid.org/0000-0002-4093-7557	en_US
dc.contributor.author	Yang, Y https://orcid.org/0000-0001-5528-0546	en_US
dc.date.available	2021-01-01T18:02:29Z
dc.date.issued	2019-06-01	en_US
dc.identifier.citation	Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2019, 2019-June pp. 11380 - 11389	en_US
dc.identifier.isbn	9781728132938	en_US
dc.identifier.issn	1063-6919	en_US
dc.identifier.uri	http://hdl.handle.net/10453/139811
dc.description.abstract	© 2019 IEEE. There has been an increasing interest in 3D indoor navigation, where a robot in an environment moves to a target according to an instruction. To deploy a robot for navigation in the physical world, lots of training data is required to learn an effective policy. It is quite labour intensive to obtain sufficient real environment data for training robots while synthetic data is much easier to construct by render-ing. Though it is promising to utilize the synthetic environments to facilitate navigation training in the real world, real environment are heterogeneous from synthetic environment in two aspects. First, the visual representation of the two environments have significant variances. Second, the houseplans of these two environments are quite different. There-fore two types of information,i.e. visual representation and policy behavior, need to be adapted in the reinforce mentmodel. The learning procedure of visual representation and that of policy behavior are presumably reciprocal. We pro-pose to jointly adapt visual representation and policy behavior to leverage the mutual impacts of environment and policy. Specifically, our method employs an adversarial feature adaptation model for visual representation transfer anda policy mimic strategy for policy behavior imitation. Experiment shows that our method outperforms the baseline by 19.47% without any additional human annotations.	en_US
dc.relation.ispartof	Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition	en_US
dc.relation.isbasedon	10.1109/CVPR.2019.01165	en_US
dc.rights	info:eu-repo/semantics/openAccess
dc.title	Sim-real joint reinforcement transfer for 3D indoor navigation	en_US
dc.type	Conference Proceeding
utslib.citation.volume	2019-June	en_US
pubs.embargo.period	Not known	en_US
pubs.organisational-group	/University of Technology Sydney
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology/School of Computer Science
pubs.organisational-group	/University of Technology Sydney/Strength - CAI - Centre for Artificial Intelligence
utslib.copyright.status	open_access	*
pubs.publication-status	Published	en_US
pubs.volume	2019-June	en_US

Abstract:

© 2019 IEEE. There has been an increasing interest in 3D indoor navigation, where a robot in an environment moves to a target according to an instruction. To deploy a robot for navigation in the physical world, lots of training data is required to learn an effective policy. It is quite labour intensive to obtain sufficient real environment data for training robots while synthetic data is much easier to construct by render-ing. Though it is promising to utilize the synthetic environments to facilitate navigation training in the real world, real environment are heterogeneous from synthetic environment in two aspects. First, the visual representation of the two environments have significant variances. Second, the houseplans of these two environments are quite different. There-fore two types of information,i.e. visual representation and policy behavior, need to be adapted in the reinforce mentmodel. The learning procedure of visual representation and that of policy behavior are presumably reciprocal. We pro-pose to jointly adapt visual representation and policy behavior to leverage the mutual impacts of environment and policy. Specifically, our method employs an adversarial feature adaptation model for visual representation transfer anda policy mimic strategy for policy behavior imitation. Experiment shows that our method outperforms the baseline by 19.47% without any additional human annotations.

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/139811