Competitive and cooperative heterogeneous deep reinforcement learning

Publication Type:
Conference Proceeding
Citation:
Proceedings of the International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS), May 2020, pp. 1656-1664
Issue Date:
2020-01-01
Abstract:
Numerous deep reinforcement learning (DRL) methods have been proposed, including deterministic, stochastic, and evolution-based hybrid methods. However, among these methodologies there is no clear winner that consistently outperforms the others on every task in terms of effective exploration, sample efficiency, and stability. In this work, we present a competitive and cooperative heterogeneous deep reinforcement learning framework called C2HRL. C2HRL aims to learn a superior agent that exceeds the capabilities of any individual agent in an agent pool through two agent-management mechanisms: one competitive, the other cooperative. The competitive mechanism forces agents to compete for computing resources and to explore and exploit diverse regions of the solution space. Under this strategy, resources are allocated to the agent best suited to the specific task and random seed setting, which results in better sample efficiency and stability. The other mechanism, cooperation, asks heterogeneous agents to share their exploration experiences so that all agents can learn from a diverse set of policies. These experiences are stored in a two-level replay buffer, yielding an overall more effective exploration strategy. We evaluated C2HRL on a range of continuous control tasks from the MuJoCo benchmark. The experimental results demonstrate that C2HRL achieves better sample efficiency and greater stability than three state-of-the-art DRL baselines.
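The abstract describes two mechanisms: shared exploration experience held in a two-level replay buffer, and competition for the training budget. The following Python sketch is only an illustration of those ideas under our own assumptions, not the authors' implementation; the names TwoLevelReplayBuffer and allocate_steps, the shared_ratio mixing parameter, and the budget-splitting rule are all hypothetical.

import random
from collections import deque

class TwoLevelReplayBuffer:
    # Hypothetical two-level buffer: a private deque per agent plus a
    # shared pool of transitions that every agent can sample from.
    def __init__(self, local_size=10000, shared_size=50000):
        self.local = {}                        # agent_id -> deque of transitions
        self.shared = deque(maxlen=shared_size)
        self.local_size = local_size

    def add(self, agent_id, transition):
        self.local.setdefault(agent_id, deque(maxlen=self.local_size)).append(transition)
        self.shared.append(transition)         # exploration experience is shared with the pool

    def sample(self, agent_id, batch_size, shared_ratio=0.5):
        # Mix the agent's own experience with experience gathered by the other agents.
        n_shared = int(batch_size * shared_ratio)
        own = list(self.local.get(agent_id, []))
        batch = random.sample(list(self.shared), min(n_shared, len(self.shared)))
        batch += random.sample(own, min(batch_size - len(batch), len(own)))
        return batch

def allocate_steps(scores, total_steps):
    # Hypothetical competitive allocation: the agent with the best recent
    # evaluation score receives most of the next training budget, while the
    # others keep a small floor so no agent is starved of updates.
    best = max(scores, key=scores.get)
    floor = total_steps // (4 * len(scores))
    alloc = {agent_id: floor for agent_id in scores}
    alloc[best] += total_steps - floor * len(scores)
    return alloc

if __name__ == "__main__":
    buf = TwoLevelReplayBuffer()
    buf.add("ddpg", ("s", "a", 1.0, "s2"))
    buf.add("sac", ("s", "a", 0.5, "s2"))
    print(buf.sample("ddpg", batch_size=2))
    print(allocate_steps({"ddpg": 120.0, "sac": 95.0, "es": 80.0}, total_steps=1000))

In this sketch, the buffer realizes the cooperative mechanism (all agents can learn from the shared pool), and allocate_steps realizes the competitive one (compute follows the currently best agent); the actual selection and scheduling rules used by C2HRL are described in the paper itself.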