Multiple intersections traffic signal control based on cooperative multi-agent reinforcement learning

Liu, Junxiu; Qin, Sheng; Su, Min; Luo, Yuling; Wang, Yanhu; Yang, Scott

doi:10.1016/j.ins.2023.119484

Journal article

Information Sciences, Volume: 647, Start page: 119484

Swansea University Author: Scott Yang

DOI (Published version): 10.1016/j.ins.2023.119484

Published in:	Information Sciences
ISSN:	0020-0255
Published:	Elsevier BV 2023
URI:	https://cronfa.swan.ac.uk/Record/cronfa64123

Abstract:	In multi-agent traffic signal control, the traffic signal at each intersection is controlled by an independent agent. Because each agent's control policy is dynamic, in a large traffic network an adjustment of one agent's policy has non-stationary effects on surrounding intersections, which destabilizes the overall system. This non-stationarity therefore needs to be eliminated to stabilize the multi-agent system. This work proposes a collaborative multi-agent reinforcement learning method that enables the system to overcome the instability problem through a collaborative mechanism. Decentralized learning with limited communication is used to reduce the communication latency between agents. A Shapley value reward function is applied to comprehensively calculate the contribution of each agent and avoid the influence of variation in the reward function coefficients, thereby reducing unstable factors. The Kullback-Leibler divergence is then used to distinguish the current policy from historical policies, and the loss function is optimized to eliminate the environmental non-stationarity. Experimental results demonstrate that the Shapley value reward function reduces the average travel time and the optimized loss function reduces its standard deviation; this work provides an alternative for traffic signal control at multiple intersections.
Keywords:	Traffic signal control, Reinforcement learning, Multi-agent system
College:	Faculty of Science and Engineering
Funders:	This research is supported by the National Natural Science Foundation of China under Grant 61976063, the Guangxi Natural Science Foundation under Grant 2022GXNSFFA035028, the research fund of Guangxi Normal University under Grant 2021JC006, and the AI+Education research project of Guangxi Humanities Society Science Development Research Center under Grant ZXZJ202205.
Start Page:	119484
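
Illustrative note: the abstract names two quantities, the Shapley value (used for the reward) and the Kullback-Leibler divergence (used to compare current and historical policies). The record does not give the paper's value function or loss, so the following is only a minimal Python sketch of the two textbook definitions. The agents "A", "B", "C", the function toy_value and its numbers, and the example policy distributions are all hypothetical and are not taken from the paper.

```python
from itertools import permutations
from math import factorial, log

def shapley_values(agents, value):
    """Exact Shapley values: average each agent's marginal contribution
    over every ordering in which the agents can join the coalition."""
    totals = {a: 0.0 for a in agents}
    for order in permutations(agents):
        coalition = frozenset()
        for a in order:
            with_a = coalition | {a}
            totals[a] += value(with_a) - value(coalition)
            coalition = with_a
    n_orders = factorial(len(agents))
    return {a: t / n_orders for a, t in totals.items()}

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions over the same actions.
    eps guards against log(0); it is an implementation convenience."""
    return sum(pi * log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical coalition value: each intersection agent contributes a base
# amount, and neighbouring agents A and B earn a bonus when both cooperate.
BASE = {"A": 1.0, "B": 2.0, "C": 3.0}

def toy_value(coalition):
    bonus = 2.0 if {"A", "B"} <= coalition else 0.0
    return sum(BASE[a] for a in coalition) + bonus

if __name__ == "__main__":
    # The A-B bonus is split equally between A and B.
    print(shapley_values(["A", "B", "C"], toy_value))
    # Divergence between a hypothetical current and historical signal-phase policy.
    print(kl_divergence([0.9, 0.1], [0.5, 0.5]))
```

In this toy game the Shapley values sum to the value of the full coalition, which is the property that makes them a natural way to split a shared reward among agents. Exact enumeration grows factorially with the number of agents, so a larger network would need sampling or a neighbourhood restriction; how the paper handles this is not stated in this record.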
