Journal article
Traffic signal control using reinforcement learning based on the teacher-student framework
Expert Systems with Applications, Volume: 228, Start page: 120458
Swansea University Author: Scott Yang
PDF | Accepted Manuscript (669.31KB)
DOI (Published version): 10.1016/j.eswa.2023.120458
Published in: | Expert Systems with Applications |
---|---|
ISSN: | 0957-4174 |
Published: | Elsevier BV, 2023 |
Online Access: | Check full text |
URI: | https://cronfa.swan.ac.uk/Record/cronfa63462 |
Abstract: | Reinforcement Learning (RL) is an effective method for adaptive traffic signal control. As one type of RL, the teacher-student framework has been found helpful in improving model performance across different application fields (such as robot control, games, and hybrid intelligence), but it is rarely applied to traffic control because the hyper-parameters and the number of state-action pairs experienced are difficult to determine. In this work, the teacher-student framework is used for traffic signal control, where only a single reward function is designed to guide the student agent; this reduces the number of hyper-parameters and the model complexity. Specifically, the teacher agent uses an importance function to evaluate and guide the student, and this importance function is combined with the environment reward to form a synthetic reward for the student agent. Experimental results under different traffic environments show that the proposed method achieves the expected performance enhancement and outperforms most state-of-the-art RL-based traffic signal control methods. |
College: | Faculty of Science and Engineering |
Funders: | This research is supported by the National Natural Science Foundation of China under Grant 61976063, the Guangxi Natural Science Foundation under Grant 2022GXNSFFA035028, the research fund of Guangxi Normal University under Grant 2021JC006, and the AI+Education research project of the Guangxi Humanities Society Science Development Research Center under Grant ZXZJ202205. |
Start Page: | 120458 |
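The abstract describes the student's learning signal as a synthetic reward formed by combining the teacher's importance function with the environment reward, but it does not give the exact formulation. The following is a minimal illustrative sketch only, assuming a simple weighted additive combination; the function and parameter names (`synthetic_reward`, `importance`, `beta`) are hypothetical and not taken from the paper.

```python
# Illustrative sketch: form a synthetic reward for the student agent by combining
# the environment reward with a teacher-provided importance score.
# The additive form and the weighting coefficient beta are assumptions; the paper's
# exact combination rule is not stated in the abstract.

def synthetic_reward(env_reward: float, importance: float, beta: float = 0.5) -> float:
    """Combine the environment reward with the teacher's importance score."""
    return env_reward + beta * importance

# Hypothetical usage inside a student agent's training loop:
# importance = teacher_importance(state, action)      # teacher evaluates the state-action pair
# r_syn = synthetic_reward(env_reward, importance)    # student updates its policy using r_syn
```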