Human feedback as action assignment in interactive reinforcement learning

Raza, SA; Williams, MA

Publisher: ASSOC COMPUTING MACHINERY
Publication Type: Journal Article
Citation: ACM Transactions on Autonomous and Adaptive Systems, 2020, 14 (4)
Issue Date: 2020-09-01

Closed Access

Filename	Description	Size	Format
3404197.pdf	Published version	4 MB	Adobe PDF

This item is closed access and not available.

Full metadata record

Field	Value	Language
dc.contributor.author	Raza, SA
dc.contributor.author	Williams, MA
dc.date.accessioned	2021-02-07T19:22:16Z
dc.date.available	2021-02-07T19:22:16Z
dc.date.issued	2020-09-01
dc.identifier.citation	ACM Transactions on Autonomous and Adaptive Systems, 2020, 14, (4)
dc.identifier.issn	1556-4665
dc.identifier.issn	1556-4703
dc.identifier.uri	http://hdl.handle.net/10453/145898
dc.language	English
dc.publisher	ASSOC COMPUTING MACHINERY
dc.relation	http://purl.org/au-research/grants/arc/DP160102693
dc.relation.ispartof	ACM Transactions on Autonomous and Adaptive Systems
dc.relation.isbasedon	10.1145/3404197
dc.rights	info:eu-repo/semantics/closedAccess
dc.subject	0801 Artificial Intelligence and Image Processing, 1702 Cognitive Sciences
dc.subject.classification	Artificial Intelligence & Image Processing
dc.title	Human feedback as action assignment in interactive reinforcement learning
dc.type	Journal Article
utslib.citation.volume	14
utslib.for	0801 Artificial Intelligence and Image Processing
utslib.for	1702 Cognitive Sciences
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology
pubs.organisational-group	/University of Technology Sydney/Strength - AAII - Australian Artificial Intelligence Institute
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology/School of Computer Science
pubs.organisational-group	/University of Technology Sydney
utslib.copyright.status	closed_access	*
dc.date.updated	2021-02-07T19:21:52Z
pubs.issue	4
pubs.publication-status	Published
pubs.volume	14
utslib.citation.issue	4

Abstract:

© 2020 ACM. Teaching by demonstrations and teaching by assigning rewards are two popular methods of knowledge transfer in humans. However, showing the right behaviour (by demonstration) may appear more natural to a human teacher than assessing the learner's performance and assigning a reward or punishment to it. In the context of robot learning, the preference between these two approaches has not been studied extensively. In this article, we propose a method that replaces the traditional method of reward assignment with action assignment (which is similar to providing a demonstration) in interactive reinforcement learning. The main purpose of the suggested action is to compute a reward by seeing if the suggested action was followed by the self-acting agent or not. We compared action assignment with reward assignment via a user study conducted over the web using a two-dimensional maze game. The logs of interactions showed that action assignment significantly improved users' ability to teach the right behaviour. The survey results showed that both action and reward assignment seemed highly natural and usable, reward assignment required more mental effort, repeatedly assigning rewards and seeing the agent disobey commands caused frustration in users, and many users desired to control the agent's behaviour directly.
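
The abstract's central idea, deriving a reward from whether the agent followed the teacher's suggested action, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the +1/-1/0 reward values, and the use of a tabular Q-learning update are all assumptions made for illustration, since the record does not specify them.

```python
from typing import Dict, Hashable, Optional, Sequence, Tuple

QTable = Dict[Tuple[Hashable, str], float]


def reward_from_action_assignment(
    agent_action: str,
    suggested_action: Optional[str],
    follow_reward: float = 1.0,      # assumed value, not from the paper
    disobey_penalty: float = -1.0,   # assumed value, not from the paper
) -> float:
    """Turn a teacher's suggested action into a scalar reward.

    The agent acts on its own; the reward only reflects whether its
    chosen action matched the suggestion. No suggestion gives 0.
    """
    if suggested_action is None:
        return 0.0
    return follow_reward if agent_action == suggested_action else disobey_penalty


def q_update(
    q: QTable,
    state: Hashable,
    action: str,
    reward: float,
    next_state: Hashable,
    actions: Sequence[str],
    alpha: float = 0.5,
    gamma: float = 0.9,
) -> None:
    """Standard tabular Q-learning update (an assumed learner, for illustration)."""
    best_next = max(q.get((next_state, a), 0.0) for a in actions)
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (reward + gamma * best_next - old)


# Hypothetical maze step: the agent moved "up" while the teacher suggested "left".
ACTIONS = ("up", "down", "left", "right")
q: QTable = {}
r = reward_from_action_assignment("up", "left")
q_update(q, (0, 0), "up", r, (0, 1), ACTIONS)
```

In this sketch the teacher never controls the agent directly; the suggestion only shapes the reward signal, which matches the abstract's description of a self-acting agent.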

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/145898