Maximum Entropy Reinforcement Learning with Evolution Strategies

Publisher:
IEEE
Publication Type:
Conference Proceeding
Citation:
Proceedings of the International Joint Conference on Neural Networks, 2020, 00, pp. 1-8
Issue Date:
2020-07-01
© 2020 IEEE. Evolution strategies (ES) have recently attracted attention for solving challenging tasks with low computation cost and high scalability. However, ES-based reinforcement learning (RL) methods are known to suffer from low stability: without careful design, they are sensitive to local optima and learn unstably. There is therefore a clear need to improve the stability of ES methods on RL problems. In this paper, we propose a simple yet efficient ES method that stabilizes learning. Specifically, we propose a framework that incorporates maximum entropy reinforcement learning into evolution strategies, and we derive an efficient entropy calculation for linear policies. Based on this framework, we present a practical algorithm, maximum entropy evolution policy search, that is efficient and stable for policy search in continuous control. Our algorithm shows high stability across random seeds and achieves performance comparable to existing derivative-free RL methods on several well-known MuJoCo robotic control benchmarks.
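To make the idea concrete, the sketch below combines a basic ES gradient estimator with an entropy bonus for a linear Gaussian policy, whose entropy has a closed form. This is a minimal illustration of the general recipe the abstract describes, not the paper's algorithm: the toy fitness function, the isotropic noise model, and all hyperparameters (perturbation scale, entropy coefficient, learning rate) are assumptions chosen for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)

def gaussian_entropy(log_sigma, dim):
    # Closed-form entropy of an isotropic Gaussian policy
    # a = W s + sigma * noise:  H = d/2 * log(2*pi*e) + d * log_sigma.
    # For linear Gaussian policies this is cheap to evaluate exactly,
    # which is the kind of efficient entropy term the abstract refers to.
    return 0.5 * dim * np.log(2 * np.pi * np.e) + dim * log_sigma

def fitness(theta, alpha=0.01):
    # theta = [W, log_sigma]; toy 1-D task (assumed, not from the paper):
    # the policy should output the state itself. Fitness = return + entropy bonus.
    W, log_sigma = theta
    states = rng.normal(size=32)
    actions = W * states + np.exp(log_sigma) * rng.normal(size=32)
    ret = -np.mean((actions - states) ** 2)
    return ret + alpha * gaussian_entropy(log_sigma, dim=1)

def es_step(theta, n=64, eps=0.1, lr=0.05):
    # Vanilla ES gradient estimate from n Gaussian perturbations,
    # with fitness shaping (standardization) for stability.
    noise = rng.normal(size=(n, theta.size))
    f = np.array([fitness(theta + eps * e) for e in noise])
    f = (f - f.mean()) / (f.std() + 1e-8)
    grad = (noise.T @ f) / (n * eps)
    return theta + lr * grad

theta = np.array([0.0, 0.0])  # initial [W, log_sigma]
for _ in range(300):
    theta = es_step(theta)
```

Under this setup the entropy bonus keeps the exploration noise from collapsing too quickly while the return term pulls `W` toward the task optimum; the paper's method builds the same kind of entropy-regularized objective into a more careful ES procedure.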