Potential based reward shaping using learning to rank

Raza, SA; Williams, MA


Publication Type: Conference Proceeding
Citation: ACM/IEEE International Conference on Human-Robot Interaction, 2017, pp. 261-262
Issue Date: 2017-03-06

Closed Access

	Filename	Description	Size	Format
	Potential Based Reward Shaping Using Learning to Rank.pdf	Published version	645.57 kB	Adobe PDF


This item is closed access and not available.

Full metadata record

Field	Value	Language
dc.contributor.author	Raza, SA https://orcid.org/0000-0001-6570-4808	en_US
dc.contributor.author	Williams, MA https://orcid.org/0000-0002-1047-0503	en_US
dc.date.issued	2017-03-06	en_US
dc.identifier.citation	ACM/IEEE International Conference on Human-Robot Interaction, 2017, pp. 261 - 262	en_US
dc.identifier.isbn	9781450348850	en_US
dc.identifier.uri	http://hdl.handle.net/10453/127420
dc.description.abstract	© 2017 Authors. This paper presents a novel method for computing a potential function from human input for potential-based reward shaping. It defines a ranking over the state space, which is used to define a potential function. Specifically, it seeks multiple rankings, from partial to full, of the robot's states from a user in an HRI scenario. These rankings are used to train a ranking model with a learning-to-rank algorithm. The ranking model is then used to define a complete ranking of states. From the ranked states, a potential function is computed using a mapping function. As a proof of concept, we compared the method with a baseline reinforcement learner in a simulated domain. The empirical results showed that the proposed method clearly outperformed the benchmark.	en_US
dc.relation.ispartof	ACM/IEEE International Conference on Human-Robot Interaction	en_US
dc.relation.isbasedon	10.1145/3029798.3038421	en_US
dc.title	Potential based reward shaping using learning to rank	en_US
dc.type	Conference Proceeding
utslib.for	0801 Artificial Intelligence and Image Processing	en_US
utslib.for	0804 Data Format	en_US
pubs.embargo.period	Not known	en_US
pubs.organisational-group	/University of Technology Sydney
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology/School of Computer Science
pubs.organisational-group	/University of Technology Sydney/Strength - CAI - Centre for Artificial Intelligence
utslib.copyright.status	closed_access
pubs.publication-status	Published	en_US

Abstract:

© 2017 Authors. This paper presents a novel method for computing a potential function from human input for potential-based reward shaping. It defines a ranking over the state space, which is used to define a potential function. Specifically, it seeks multiple rankings, from partial to full, of the robot's states from a user in an HRI scenario. These rankings are used to train a ranking model with a learning-to-rank algorithm. The ranking model is then used to define a complete ranking of states. From the ranked states, a potential function is computed using a mapping function. As a proof of concept, we compared the method with a baseline reinforcement learner in a simulated domain. The empirical results showed that the proposed method clearly outperformed the benchmark.
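
The pipeline described in the abstract (user rankings, a ranking model, a complete ranking, a potential function, a shaped reward) can be illustrated with a minimal sketch. The abstract does not name the learning-to-rank algorithm or the mapping function, so everything below is an assumption made for illustration: a linear pairwise ranker with a hinge-style update stands in for the ranking model, the potential is taken to be the normalized rank position, and the helper names (`train_pairwise_ranker`, `potential_from_scores`, `shaped_reward`) and the 5-state corridor are hypothetical. Only the shaping term gamma * phi(s') - phi(s) is the standard form used in potential-based reward shaping.

```python
import random


def train_pairwise_ranker(features, rankings, epochs=200, lr=0.1, seed=0):
    """Fit linear weights w so that preferred states score higher.

    features: dict mapping state -> list of floats (assumed state features).
    rankings: list of partial or full rankings, each a list of states
              ordered from least to most preferred.
    This pairwise hinge-style update is an illustrative stand-in, not the
    algorithm used in the paper.
    """
    rng = random.Random(seed)
    dim = len(next(iter(features.values())))
    w = [0.0] * dim
    # Expand every ranking into (worse, better) preference pairs.
    pairs = [(r[i], r[j])
             for r in rankings
             for i in range(len(r))
             for j in range(i + 1, len(r))]
    for _ in range(epochs):
        rng.shuffle(pairs)
        for worse, better in pairs:
            diff = [b - a for a, b in zip(features[worse], features[better])]
            margin = sum(wi * di for wi, di in zip(w, diff))
            if margin < 1.0:  # pair not yet separated by the margin
                w = [wi + lr * di for wi, di in zip(w, diff)]
    return w


def potential_from_scores(features, w, scale=1.0):
    """Rank all states by model score and map rank position to [0, scale].

    The normalized-rank mapping is an assumption; it needs at least 2 states.
    """
    def score(state):
        return sum(wi * xi for wi, xi in zip(w, features[state]))

    ordered = sorted(features, key=score)  # least to most preferred
    n = len(ordered)
    return {s: scale * i / (n - 1) for i, s in enumerate(ordered)}


def shaped_reward(reward, state, next_state, phi, gamma=0.99):
    """Environment reward plus the shaping term gamma*phi(s') - phi(s)."""
    return reward + gamma * phi[next_state] - phi[state]


# Hypothetical 5-state corridor; the single feature is the position.
features = {s: [float(s)] for s in range(5)}
# Two partial rankings a user might give (least to most preferred).
user_rankings = [[0, 2, 4], [1, 3]]

w = train_pairwise_ranker(features, user_rankings)
phi = potential_from_scores(features, w)
print(phi)
print(shaped_reward(0.0, 1, 2, phi, gamma=0.9))
```

In this sketch the two partial rankings never compare all states directly, yet the learned scoring model still induces a complete ordering over the five states, which is the role the abstract assigns to the ranking model.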

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/127420