SOON: Scenario Oriented Object Navigation with Graph-based Exploration

Zhu, F; Liang, X; Zhu, Y; Yu, Q; Chang, X; Liang, X

Publisher: IEEE
Publication Type: Conference Proceeding
Citation: 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021, 00, pp. 12684-12694
Issue Date: 2021-11-13

This item is closed access and not available.

Full metadata record

Field	Value	Language
dc.contributor.author	Zhu, F
dc.contributor.author	Liang, X
dc.contributor.author	Zhu, Y
dc.contributor.author	Yu, Q
dc.contributor.author	Chang, X https://orcid.org/0000-0002-7778-8807
dc.contributor.author	Liang, X
dc.date	2021-06-20
dc.date.accessioned	2022-06-04T22:45:01Z
dc.date.available	2022-06-04T22:45:01Z
dc.date.issued	2021-11-13
dc.identifier.citation	2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021, 00, pp. 12684-12694
dc.identifier.isbn	978-1-6654-4509-2
dc.identifier.issn	1063-6919
dc.identifier.issn	2575-7075
dc.identifier.uri	http://hdl.handle.net/10453/157928
dc.description.abstract	The ability to navigate like a human towards a language-guided target from anywhere in a 3D embodied environment is one of the ‘holy grail’ goals of intelligent robots. Most visual navigation benchmarks, however, focus on navigating toward a target from a fixed starting point, guided by an elaborate set of instructions that depicts the route step by step. This approach deviates from real-world problems, in which a human only describes what the object and its surroundings look like and asks the robot to start navigating from anywhere. Accordingly, in this paper, we introduce a Scenario Oriented Object Navigation (SOON) task. In this task, an agent is required to navigate from an arbitrary position in a 3D embodied environment to localize a target following a scene description. To give a promising direction for solving this task, we propose a novel graph-based exploration (GBE) method, which models the navigation state as a graph, learns knowledge from that graph through graph-based exploration, and stabilizes training by learning from sub-optimal trajectories. We also propose a new large-scale benchmark, the From Anywhere to Object (FAO) dataset. To avoid target ambiguity, the descriptions in FAO provide rich semantic scene information, including object attributes, object relationships, region descriptions, and nearby region descriptions. Our experiments reveal that the proposed GBE outperforms various state-of-the-art methods on both the FAO and R2R datasets, and the ablation studies on FAO validate the quality of the dataset.
dc.language	en
dc.publisher	IEEE
dc.relation	http://purl.org/au-research/grants/arc/DE190100626
dc.relation.ispartof	2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
dc.relation.ispartof	IEEE/CVF Conference on Computer Vision and Pattern Recognition
dc.relation.ispartofseries	IEEE Conference on Computer Vision and Pattern Recognition
dc.relation.isbasedon	10.1109/cvpr46437.2021.01250
dc.rights	info:eu-repo/semantics/closedAccess
dc.title	SOON: Scenario Oriented Object Navigation with Graph-based Exploration
dc.type	Conference Proceeding
utslib.citation.volume	00
utslib.location.activity	Nashville, TN, USA
pubs.organisational-group	/University of Technology Sydney
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology/School of Computer Science
utslib.copyright.status	closed_access	*
pubs.consider-herdc	false
dc.date.updated	2022-06-04T22:44:58Z
pubs.finish-date	2021-06-25
pubs.place-of-publication	Piscataway, USA
pubs.publication-status	Published
pubs.start-date	2021-06-20
pubs.volume	00
dc.location	Piscataway, USA
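
The abstract in the record above says that GBE "models the navigation state as a graph". As a rough illustration of that idea only, the sketch below keeps a graph of visited viewpoints plus observed-but-unvisited ones and lets an agent jump to any unvisited node by following the known edges. It is not the authors' implementation: the class `NavGraph`, its methods, the viewpoint names, and the hand-written `scores` table are all hypothetical, and a learned policy would normally supply the scores.

```python
from collections import deque


class NavGraph:
    """Toy navigation-state graph (hypothetical, not the paper's GBE code)."""

    def __init__(self, start):
        self.edges = {start: set()}  # adjacency over every known viewpoint
        self.visited = {start}
        self.current = start

    def observe(self, neighbours):
        """Record viewpoints that are navigable from the current one."""
        for n in neighbours:
            self.edges.setdefault(n, set()).add(self.current)
            self.edges[self.current].add(n)

    def frontier(self):
        """Known viewpoints that have not been visited yet."""
        return sorted(set(self.edges) - self.visited)

    def path_to(self, goal):
        """Breadth-first search over the known graph from the current viewpoint."""
        parent = {self.current: None}
        queue = deque([self.current])
        while queue:
            node = queue.popleft()
            if node == goal:
                path = []
                while node is not None:
                    path.append(node)
                    node = parent[node]
                return path[::-1]
            for nxt in sorted(self.edges[node]):
                if nxt not in parent:
                    parent[nxt] = node
                    queue.append(nxt)
        return None

    def move_to(self, goal):
        """Walk to `goal` along known edges and mark the traversed nodes visited."""
        path = self.path_to(goal)
        if path is None:
            raise ValueError(f"{goal!r} is not reachable in the known graph")
        self.visited.update(path)
        self.current = goal
        return path


# Tiny walk-through with made-up viewpoints and made-up scores.
g = NavGraph("a")
g.observe(["b", "c"])
g.move_to("b")
g.observe(["d"])
scores = {"c": 0.2, "d": 0.9}  # assumption: stand-in for a learned policy
target = max(g.frontier(), key=scores.get)
first_path = g.move_to(target)
second_path = g.move_to("c")  # backtracks through already-known edges
```

Keeping the unvisited nodes in the graph is what allows the agent in this sketch to return to an earlier branch instead of only choosing among the immediate neighbours of its current position.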

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/157928