Revisiting EmbodiedQA: A Simple Baseline and Beyond.

Wu, Y; Jiang, L; Yang, Y

Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Publication Type: Journal Article
Citation: IEEE Transactions on Image Processing: a publication of the IEEE Signal Processing Society, 2020, 29, pp. 3984-3992
Issue Date: 2020-01-23

This item is open access.

Full metadata record

Field	Value	Language
dc.contributor.author	Wu, Y https://orcid.org/0000-0002-1680-8253
dc.contributor.author	Jiang, L
dc.contributor.author	Yang, Y https://orcid.org/0000-0001-5528-0546
dc.date.accessioned	2021-03-18T01:50:21Z
dc.date.available	2021-03-18T01:50:21Z
dc.date.issued	2020-01-23
dc.identifier.citation	IEEE transactions on image processing : a publication of the IEEE Signal Processing Society, 2020, 29, pp. 3984-3992
dc.identifier.issn	1057-7149
dc.identifier.issn	1941-0042
dc.identifier.uri	http://hdl.handle.net/10453/147314
dc.description.abstract	In Embodied Question Answering (EmbodiedQA), an agent interacts with an environment to gather the information needed to answer user questions. Existing works have laid a solid foundation for solving this problem, but current performance, especially in navigation, suggests that EmbodiedQA might be too challenging for contemporary approaches. In this paper, we empirically study this problem and introduce 1) a simple yet effective baseline that achieves promising performance; and 2) an easier, more practical setting for EmbodiedQA in which an agent has a chance to adapt the trained model to a new environment before it answers users' questions. In this new setting, we randomly place a few objects in new environments and upgrade the agent policy with a distillation network so that it retains the generalization ability of the trained model. On the EmbodiedQA v1 benchmark, under the standard setting, our simple baseline achieves results highly competitive with the state of the art; in the new setting, we find that this small change in settings yields a notable gain in navigation.
dc.format	Print-Electronic
dc.language	eng
dc.publisher	Institute of Electrical and Electronics Engineers (IEEE)
dc.relation	http://purl.org/au-research/grants/arc/DP200100938
dc.relation.ispartof	IEEE transactions on image processing : a publication of the IEEE Signal Processing Society
dc.relation.isbasedon	10.1109/tip.2020.2967584
dc.rights	© 2020 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.	en_US
dc.rights	info:eu-repo/semantics/openAccess
dc.subject	0801 Artificial Intelligence and Image Processing, 0906 Electrical and Electronic Engineering, 1702 Cognitive Sciences
dc.subject.classification	Artificial Intelligence & Image Processing
dc.title	Revisiting EmbodiedQA: A Simple Baseline and Beyond.
dc.type	Journal Article
utslib.citation.volume	29
utslib.location.activity	United States
utslib.for	0801 Artificial Intelligence and Image Processing
utslib.for	0906 Electrical and Electronic Engineering
utslib.for	1702 Cognitive Sciences
pubs.organisational-group	/University of Technology Sydney
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology
pubs.organisational-group	/University of Technology Sydney/Strength - AAII - Australian Artificial Intelligence Institute
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology/School of Computer Science
utslib.copyright.status	open_access	*
pubs.consider-herdc	false
dc.date.updated	2021-03-18T01:50:19Z
pubs.publication-status	Published
pubs.volume	29

Abstract:

In Embodied Question Answering (EmbodiedQA), an agent interacts with an environment to gather the information needed to answer user questions. Existing works have laid a solid foundation for solving this problem, but current performance, especially in navigation, suggests that EmbodiedQA might be too challenging for contemporary approaches. In this paper, we empirically study this problem and introduce 1) a simple yet effective baseline that achieves promising performance; and 2) an easier, more practical setting for EmbodiedQA in which an agent has a chance to adapt the trained model to a new environment before it answers users' questions. In this new setting, we randomly place a few objects in new environments and upgrade the agent policy with a distillation network so that it retains the generalization ability of the trained model. On the EmbodiedQA v1 benchmark, under the standard setting, our simple baseline achieves results highly competitive with the state of the art; in the new setting, we find that this small change in settings yields a notable gain in navigation.

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/147314