Multi-Agent Deep Reinforcement Learning with Clustering and Information Sharing for Traffic Light Cooperative Control

DU Tongchun; WANG Bo; CHENG Haoran; LUO Le; ZENG Nengmin

doi:10.11999/JEIT230857

Volume 46 Issue 2

Feb. 2024


DU Tongchun, WANG Bo, CHENG Haoran, LUO Le, ZENG Nengmin. Multi-Agent Deep Reinforcement Learning with Clustering and Information Sharing for Traffic Light Cooperative Control[J]. Journal of Electronics & Information Technology, 2024, 46(2): 538-545. doi: 10.11999/JEIT230857



doi: 10.11999/JEIT230857 cstr: 32379.14.JEIT230857

1. School of Computer and Information, Anhui Normal University, Wuhu 241008, China
2. College of Economics and Management, Harbin Engineering University, Harbin 150001, China

Received Date: 2023-08-08
Revision Received Date: 2023-12-04

Available Online: 2023-12-14

Publish Date: 2024-02-29

Abstract


To improve joint control across multiple intersections, this paper proposes a Multi-Agent Deep Recurrent Q-Network (MADRQN) for real-time control of multi-intersection traffic signals. First, traffic light control is modeled as a Markov decision process in which the controller at each intersection is treated as an agent. Second, agents are clustered according to their positions and observations. Information sharing and centralized training are then carried out within each cluster. In addition, at the end of every training process, the value function network parameters of the agent with the highest critic value are shared with the other agents. Simulation experiments in Simulation of Urban MObility (SUMO) show that the proposed method reduces the amount of communication data and makes information sharing among agents and centralized training more feasible and efficient. Compared with state-of-the-art traffic light control methods based on multi-agent deep reinforcement learning, the proposed method markedly reduces the average vehicle delay and can effectively alleviate traffic congestion.
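The clustering and parameter-sharing steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the clustering algorithm, so plain k-means is swapped in here; the feature layout (x, y, mean queue length), the function names, and the restriction of parameter sharing to cluster members are all assumptions made for illustration. Q-network weights and critic values are represented by stand-in lists of numbers.

```python
from math import dist

def cluster_agents(features, k, iters=20):
    """Plain k-means over per-agent feature vectors (assumed stand-in for the
    paper's clustering step). Returns one cluster id per agent."""
    # Deterministic farthest-point initialisation: start from agent 0, then
    # repeatedly add the agent farthest from the centroids chosen so far.
    centroids = [list(features[0])]
    while len(centroids) < k:
        far = max(features, key=lambda f: min(dist(f, c) for c in centroids))
        centroids.append(list(far))
    labels = [0] * len(features)
    for _ in range(iters):
        # Assign every agent to its nearest centroid.
        labels = [min(range(k), key=lambda j: dist(f, centroids[j])) for f in features]
        # Move each centroid to the mean of its members.
        for j in range(k):
            members = [f for f, l in zip(features, labels) if l == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def share_best_parameters(params, scores, labels):
    """Within each cluster, copy the parameters of the highest-scoring agent
    to every member (sharing scope is an assumption)."""
    shared = list(params)
    for j in set(labels):
        members = [i for i, l in enumerate(labels) if l == j]
        best = max(members, key=lambda i: scores[i])
        for i in members:
            shared[i] = list(params[best])
    return shared

# Hypothetical data: (x, y, mean queue length) for four intersections.
features = [(0.0, 0.0, 5.0), (0.0, 1.0, 6.0), (10.0, 10.0, 20.0), (10.0, 11.0, 22.0)]
labels = cluster_agents(features, k=2)
params = [[0.1], [0.2], [0.3], [0.4]]  # stand-ins for value network weights
scores = [1.0, 3.0, 2.0, 0.5]          # stand-ins for critic values
shared = share_best_parameters(params, scores, labels)
print(labels, shared)
```

In this toy setup the two nearby, lightly loaded intersections form one cluster and the two distant, congested ones form another; each cluster then adopts the weights of its best-scoring member. In practice the position and observation features would need to be scaled to comparable ranges before clustering.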

