Reward space noise for exploration in deep reinforcement learning
- Publisher:
- World Scientific Publishing
- Publication Type:
- Journal Article
- Citation:
- International Journal of Pattern Recognition and Artificial Intelligence, 2021, 35, (10), pp. 1-21
- Issue Date:
- 2021-08-01
Closed Access
| Filename | Description | Size |
|---|---|---|
| 19110414_7963428280005671.pdf | | 2.19 MB |
A fundamental challenge for reinforcement learning (RL) is how to achieve efficient exploration in initially unknown environments. Most state-of-the-art RL algorithms leverage action space noise to drive exploration. The classical strategies are computationally efficient and straightforward to implement. However, these methods may fail to perform effectively in complex environments. To address this issue, we propose a novel strategy named reward space noise (RSN) for farsighted and consistent exploration in RL. By introducing stochasticity from the reward space, we are able to change the agent's understanding of the environment and perturb its behaviors. We find that the simple RSN can achieve consistent exploration and scale to complex domains without intensive computational cost. To demonstrate the effectiveness and scalability of the proposed method, we implement a deep Q-learning agent with reward noise and evaluate its exploratory performance on a set of Atari games that are challenging for the naive ε-greedy strategy. The results show that reward noise outperforms action noise in most games and performs comparably in others. Concretely, we found that in early training the best exploratory performance of reward noise is clearly better than that of action noise, which demonstrates that reward noise can quickly explore valuable states and aid in finding the optimal policy.
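The abstract describes perturbing the reward signal rather than the action selection. A minimal sketch of that idea, assuming zero-mean Gaussian noise on the scalar reward (the paper's exact noise distribution, scale `sigma`, and schedule are not given in the abstract, so these are illustrative assumptions):

```python
import numpy as np

def perturb_reward(r, sigma, rng):
    """Reward space noise: add zero-mean Gaussian noise to the reward.

    sigma is a hypothetical noise-scale hyperparameter; feeding the
    perturbed reward into learning changes the agent's value estimates,
    which in turn perturbs its behavior and drives exploration.
    """
    return r + rng.normal(0.0, sigma)

def q_learning_target(q_next, r_noisy, gamma=0.99):
    """Standard Q-learning bootstrap target, computed from the
    perturbed reward instead of the raw environment reward."""
    return r_noisy + gamma * np.max(q_next)
```

In a deep Q-learning agent, `perturb_reward` would be applied to each transition's reward before (or when) computing the TD target, leaving action selection itself greedy or ε-greedy as usual.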