Reinforcement Learning with an Ensemble of Binary Action Deep Q-Networks
1 Department of Electronics & Communication Engineering, Institute of Technology, University of Kashmir, Srinagar, J&K, 190006, India
2 Department of Computer Science, Faculty of Computers and Information, South Valley University, Qena, Egypt
3 College of Computer Engineering and Sciences, Prince Sattam bin Abdulaziz University, Al-Kharj, KSA
4 Department of Computer Science, Faculty of Computers and Information, Luxor University, Luxor, Egypt
* Corresponding Author: M. Hassaballah. Email:
Computer Systems Science and Engineering 2023, 46(3), 2651-2666. https://doi.org/10.32604/csse.2023.031720
Received 25 April 2022; Accepted 29 June 2022; Issue published 03 April 2023
Abstract
With the advent of Reinforcement Learning (RL) and its continuous progress, state-of-the-art RL systems have emerged for many challenging real-world tasks. Given the breadth of the area, a variety of techniques is found in the literature. One notable family, multiple Deep Q-Network (DQN) based RL systems, uses several DQN-based entities that learn together and communicate with one another. In such a scheme, the learning has to be distributed wisely among all entities and the inter-entity communication protocol has to be carefully designed. As more complex DQNs come to the fore, the overall complexity of these multi-entity systems has increased manyfold, leading to difficulties in training, high resource requirements, long training times, and cumbersome fine-tuning, all of which hurt performance. Taking a cue from the efficacy of parallel processing in nature, we propose a lightweight ensemble-based approach for solving core RL tasks. It uses multiple binary-action DQNs that share a common state and reward. The benefits of the proposed approach are overall simplicity, faster convergence, and better performance compared to conventional DQN-based approaches. The approach can potentially be extended to any type of DQN by forming an ensemble of it. In extensive experiments on OpenAI Gym tasks and Atari 2600 games, the proposed ensemble approach yields promising results compared to recent techniques: a state-of-the-art score of 500 on the CartPole-v1 task, 259.2 on the LunarLander-v2 task, and state-of-the-art results on four of the five Atari 2600 games.
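To make the core idea concrete, the following is a minimal, illustrative PyTorch sketch, not the authors' exact algorithm: each ensemble member is a small Q-network with exactly two outputs (its binary action), all members observe the same state and are trained on the same shared reward, and the joint environment action is decoded from the members' binary choices. The bit-style action decoding, the network sizes, and the independent per-member TD updates are assumptions made for illustration.

import random
import torch
import torch.nn as nn

class BinaryDQN(nn.Module):
    """A small Q-network with exactly two outputs, one per binary action."""
    def __init__(self, state_dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 2),
        )

    def forward(self, state):
        return self.net(state)

class BinaryDQNEnsemble:
    """K binary-action DQNs sharing the same state and reward; the joint
    action is decoded here (an assumption) from the K binary choices,
    treated as the bits of an action index."""
    def __init__(self, state_dim, num_members, gamma=0.99, lr=1e-3):
        self.members = [BinaryDQN(state_dim) for _ in range(num_members)]
        self.optims = [torch.optim.Adam(m.parameters(), lr=lr) for m in self.members]
        self.gamma = gamma

    def act(self, state, eps=0.1):
        s = torch.as_tensor(state, dtype=torch.float32)
        # Each member makes its own epsilon-greedy binary choice on the shared state.
        bits = [random.randint(0, 1) if random.random() < eps
                else int(m(s).argmax().item()) for m in self.members]
        action = sum(b << i for i, b in enumerate(bits))  # decode bits -> action index
        return action, bits

    def update(self, state, bits, reward, next_state, done):
        s = torch.as_tensor(state, dtype=torch.float32)
        ns = torch.as_tensor(next_state, dtype=torch.float32)
        for m, opt, b in zip(self.members, self.optims, bits):
            with torch.no_grad():  # shared reward, per-member bootstrap target
                target = reward + (0.0 if done else self.gamma * m(ns).max().item())
            loss = (m(s)[b] - target) ** 2  # squared TD error on the chosen binary action
            opt.zero_grad()
            loss.backward()
            opt.step()

Under this reading, a two-action task such as CartPole-v1 needs a single member, while an eight-action Atari game would use three members (2^3 = 8). A practical implementation would also add the usual DQN machinery, such as a replay buffer and target networks.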
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.