Joint Routing and Resource Scheduling Algorithm for Large-scale Multi-mode Mesh Networks Based on Reinforcement Learning

ZHU Xiaorong; HE Chuhong

doi:10.11999/JEIT231103

Volume 46 Issue 7

Jul. 2024

Turn off MathJax

Article Contents

Article Navigation > Journal of Electronics & Information Technology > 2024 > 46(7): 2773-2782

Huang Huixian, Shi Zhongke. A HYBRID ALGORITHM FOR FINDING GLOBAL OPTIMUM WITH GENETIC ALGORITHM[J]. Journal of Electronics & Information Technology, 2001, 23(9): 875-878.

Citation:

ZHU Xiaorong, HE Chuhong. Joint Routing and Resource Scheduling Algorithm for Large-scale Multi-mode Mesh Networks Based on Reinforcement Learning[J]. Journal of Electronics & Information Technology, 2024, 46(7): 2773-2782. doi: 10.11999/JEIT231103

Huang Huixian, Shi Zhongke. A HYBRID ALGORITHM FOR FINDING GLOBAL OPTIMUM WITH GENETIC ALGORITHM[J]. Journal of Electronics & Information Technology, 2001, 23(9): 875-878.

Citation:

PDF( 3887 KB)

Joint Routing and Resource Scheduling Algorithm for Large-scale Multi-mode Mesh Networks Based on Reinforcement Learning

doi: 10.11999/JEIT231103

ZHU Xiaorong^,,
HE Chuhong

College of Telecommunication and Information Engineering, Nanjing University of Posts and Telecommunications, Nanjing 210003, China

Funds: The National Natural Science Foundation of China (92367102, 92067101), The Key R&D Plan of Jiangsu Province (BE2021013-3)

Received Date: 2023-10-10
Rev Recd Date: 2024-02-04

Available Online: 2024-02-26

Publish Date: 2024-07-29

Abstract

Abstract

In order to balance the transmission reliability and efficiency of large-scale multi-mode mesh networks in the new power system, a two-stage algorithm is proposed based on reinforcement learning for joint routing selection and resource scheduling in large-scale multi-mode mesh networks, building upon the description and analysis of optimization problems. In the first stage, based on the network topology information and service requirements, a multi shortest path routing algorithm is utilized to generate all the shortest paths. In the second stage, a resource scheduling algorithm based on Multi-Armed Bandit (MAB) is proposed. The algorithm constructs the arms of the MAB based on the obtained set of shortest paths, then calculates the reward according to the service demands, and finally gives the optimal route selection and resource scheduling mode for service transmission. Simulation results show that the proposed algorithm can meet different service transmission requirements, and achieve an efficient balance between the average end-to-end path delay and the average transmission success rate.
- Mesh networks,
- Routing selection,
- Resource scheduling,
- Multi-Armed Bandit (MAB),
- Reinforcement learning

FullText(HTML)

References(20)

References

[1]	BEDI G, VENAYAGAMOORTHY G K, SINGH R, et al. Review of Internet of Things (IoT) in electric power and energy systems[J]. IEEE Internet of Things Journal, 2018, 5(2): 847–870. doi: 10.1109/JIOT.2018.2802704.
[2]	TALEB S M, MERAIHI Y, GABIS A B, et al. Nodes placement in wireless mesh networks using optimization approaches: A survey[J]. Neural Computing and Applications, 2022, 34(7): 5283–5319. doi: 10.1007/s00521–022-06941-y.
[3]	ALOTAIBI E and MUKHERJEE B. A survey on routing algorithms for wireless ad-hoc and mesh networks[J]. Computer Networks, 2012, 56(2): 940–965. doi: 10.1016/j.comnet.2011.10.011.
[4]	WANG Lei, ZHANG Lianfang, SHU Yantai, et al. Multipath source routing in wireless ad hoc networks[C]. 2000 Canadian Conference on Electrical and Computer Engineering. Conference Proceedings. Navigating to a New Era (Cat. No. 00TH8492), Halifax, Canada, 2000: 479–483. doi: 10.1109/CCECE.2000.849755.
[5]	GUO Xiaoyuan, WANG Feng, LIU Jiangchuan, et al. Path diversified multi-QoS optimization in multi-channel wireless mesh networks[J]. Wireless Networks, 2014, 20(6): 1583–1596. doi: 10.1007/s11276-014-0698-x.
[6]	JIA Dongyao, ZOU Shengxiong, LI Meng, et al. Adaptive multi-path routing based on an improved leapfrog algorithm[J]. Information Sciences, 2016, 367/368: 615–629. doi: 10.1016/j.ins.2016.07.021.
[7]	SUN Yaohua, PENG Mugen, ZHOU Yangcheng, et al. Application of machine learning in wireless networks: Key techniques and open issues[J]. IEEE Communications Surveys & Tutorials, 2019, 21(4): 3072–3108. doi: 10.1109/COMST.2019.2924243.
[8]	DI VALERIO V, LO PRESTI F, PETRIOLI C, et al. CARMA: Channel-aware reinforcement learning-based multi-path adaptive routing for underwater wireless sensor networks[J]. IEEE Journal on Selected Areas in Communications, 2019, 37(11): 2634–2647. doi: 10.1109/JSAC.2019.2933968.
[9]	LIU Qingzhi, CHENG Long, JIA A L, et al. Deep reinforcement learning for communication flow control in wireless mesh networks[J]. IEEE Network, 2021, 35(2): 112–119. doi: 10.1109/MNET.011.2000303.
[10]	NG P C and SHE J. Remote proximity sensing with a novel Q-learning in Bluetooth low energy network[J]. IEEE Transactions on Wireless Communications, 2022, 21(8): 6156–6166. doi: 10.1109/TWC.2022.3147411.
[11]	WANG Jinxin, ZHANG Fan, XIE Zhonglin, et al. Joint bandwidth allocation and path selection in WANs with path cardinality constraints[J]. Journal of Communications and Information Networks, 2021, 6(3): 237–250. doi: 10.23919/JCIN.2021.9549120.
[12]	APPINI N R and REDDY A R. Joint channel assignment and bandwidth reservation using Improved FireFly Algorithm (IFA) in Wireless Mesh Networks (WMN)[J]. Wireless Personal Communications, 2023, 131(1): 455–470. doi: 10.1007/s11277-023-10439-8.
[13]	BINH L H and DUONG T V T. Load balancing routing under constraints of quality of transmission in mesh wireless network based on software defined networking[J]. Journal of Communications and Networks, 2021, 23(1): 12–22. doi: 10.23919/JCN.2021.000004.
[14]	KUMAR R, VENKANNA U, and TIWARI V. Opt-ACM: An optimized load balancing based admission control mechanism for software defined hybrid wireless based IoT (SDHW-IoT) network[J]. Computer Networks, 2021, 188: 107888. doi: 10.1016/j.comnet.2021.107888.
[15]	ALHARBI N, MACKENZIE L, and PEZAROS D. Enhancing graph routing algorithm of industrial wireless sensor networks using the covariance-matrix adaptation evolution strategy[J]. Sensors, 2022, 22(19): 7462. doi: 10.3390/s22197462.
[16]	BAROLLI A, BYLYKBASHI K, QAFZEZI E, et al. A comparison study of Weibull, normal and Boulevard distributions for wireless mesh networks considering different router replacement methods by a hybrid intelligent simulation system[J]. Journal of Ambient Intelligence and Humanized Computing, 2023, 14(8): 10181–10194. doi: 10.1007/s12652-021-03680-1.
[17]	ROZHOŇ V, HAEUPLER B, MARTINSSON A, et al. Parallel breadth-first search and exact shortest paths and stronger notions for approximate distances[C]. Proceedings of the 55th Annual ACM Symposium on Theory of Computing, Orlando, USA, 2023: 321–334. doi: 10.1145/3564246.3585235.
[18]	SILVA N, WERNECK H, SILVA T, et al. Multi-armed bandits in recommendation systems: A survey of the state-of-the-art and future directions[J]. Expert Systems with Applications, 2022, 197: 116669. doi: 10.1016/j.eswa.2022.116669.
[19]	LEE S, YU H, and LEE H. Multiagent Q-learning-based multi-UAV wireless networks for maximizing energy efficiency: Deployment and power control strategy design[J]. IEEE Internet of Things Journal, 2022, 9(9): 6434–6442. doi: 10.1109/JIOT.2021.3113128.
[20]	ZAATOURI I, ALYAOUI N, GUILOUFI A B, et al. Design and performance analysis of objective functions for RPL routing protocol[J]. Wireless Personal Communications, 2022, 124(3): 2677–2697. doi: 10.1007/s11277-022-09484-6.