Collision Observation-Based Optimization of Low-Power and Lossy IoT Network Using Reinforcement Learning
1 Department of Information and Communication Engineering, Yeungnam University, Gyeongsan-si, 38541, South Korea
2 School of Intelligent Mechatronics Engineering, Sejong University, Seoul, 05006, South Korea
3 Department of Software and Communications Engineering, Hongik University, Sejong, 30016, South Korea
* Corresponding Author: Byung-Seo Kim. Email:
(This article belongs to the Special Issue: Machine Learning-based Intelligent Systems: Theories, Algorithms, and Applications)
Computers, Materials & Continua 2021, 67(1), 799-814. https://doi.org/10.32604/cmc.2021.014751
Received 13 October 2020; Accepted 15 November 2020; Issue published 12 January 2021
Abstract
The Internet of Things (IoT) has applications in every domain, e.g., smart cities, where it provides intelligent services for sustainable urban living. Next-generation IoT networks are expected to be densely deployed in resource-constrained and lossy environments. Densely deployed nodes producing radically heterogeneous traffic patterns cause congestion and collisions in the network. At the medium access control (MAC) layer, mitigating channel collisions remains one of the main challenges of future IoT networks. Similarly, the standardized network layer uses a ranking mechanism based on hop counts and expected transmission counts (ETX), which often does not adapt to the dynamic and lossy environment and degrades performance. The ranking mechanism also requires large control overheads to update rank information. Resource-constrained IoT devices operating in a low-power and lossy network (LLN) environment need an efficient solution to these problems. Reinforcement learning (RL) algorithms such as Q-learning have recently been used to solve learning problems on LLN devices such as sensors. Thus, in this paper, an RL-based optimization of dense LLN IoT devices carrying heavy heterogeneous traffic is devised. The proposed protocol learns collision information from the MAC layer and makes intelligent decisions at the network layer. The proposed protocol also enhances the operation of the trickle timer algorithm. A Q-learning model is employed to adaptively learn the channel collision probability and network-layer ranking states with an accumulated reward function. In simulations using Contiki 3.0 Cooja, the proposed intelligent scheme achieves a lower packet loss ratio, higher throughput, lower control overhead, and lower energy consumption than other state-of-the-art mechanisms.
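The learning step the abstract refers to is the standard tabular Q-learning update, Q(s,a) ← Q(s,a) + α[r + γ max_{a'} Q(s',a') − Q(s,a)]. The minimal sketch below illustrates how such an agent could combine a binned MAC-layer collision probability with a routing rank as its state; the state discretization, the action set, the reward shaping, and every identifier here are illustrative assumptions, not the authors' Contiki implementation.

```python
import random
from collections import defaultdict

ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1  # learning rate, discount factor, exploration rate

# Q-table over (state, action) pairs; unseen entries default to 0.0.
Q = defaultdict(float)
ACTIONS = ["keep_parent", "switch_parent"]  # hypothetical network-layer decisions

def make_state(collision_prob, rank):
    """Discretize the observed collision probability (0..1) into ten bins
    and pair it with the node's current rank to form a compact state."""
    return (min(int(collision_prob * 10), 9), rank)

def choose_action(s):
    """Epsilon-greedy action selection over the Q-table."""
    if random.random() < EPSILON:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(s, a)])

def update(s, a, reward, s_next):
    """One-step Q-learning update using the best action in the next state."""
    best_next = max(Q[(s_next, a2)] for a2 in ACTIONS)
    Q[(s, a)] += ALPHA * (reward + GAMMA * best_next - Q[(s, a)])
```

In such a sketch, the reward would typically be shaped to penalize observed collisions and successful-transmission cost (e.g., ETX) while rewarding delivered packets, so that the accumulated reward steers rank and parent decisions away from congested paths.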
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.