Conference Paper/Proceeding/Abstract

Predator-Prey Q-Learning Based Collaborative Coverage Path Planning for Swarm Robotics

Michael Watson, Hans Ren, Farshad Arvin, Junyan Hu

2024 Annual Conference Towards Autonomous Robotic Systems (TAROS)

Swansea University Author: Hans Ren

  • Michael_TAROS.pdf

    PDF | Accepted Manuscript

    Author accepted manuscript document released under the terms of a Creative Commons CC-BY licence using the Swansea University Research Publications Policy (rights retention).

    Download (627KB)

Abstract

Coverage Path Planning (CPP) is an effective approach to let intelligent robots cover an area by finding feasible paths through the environment. In this paper, we focus on using reinforcement learning to learn about a given environment and find the most efficient path that explores all target points. To overcome the limitations of standard Q-learning based CPP, which often falls into a local optimum and may be inefficient in large-scale environments, two improvements are considered: the use of a robot swarm working towards the same goal, and the augmentation of the Q-learning algorithm with a predator-prey based reward system. Existing predator-prey based reward systems provide larger rewards the further an agent is from its predator; this paper adapts the concept to a robot swarm by simulating each agent of the swarm as both predator and prey. Simulation case studies and comparisons with standard Q-learning show that the proposed method achieves superior coverage performance in complicated environments.
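
The abstract's key idea is a reward shaped by inter-agent distance: each robot is rewarded more for moving away from its nearest swarm-mate, which plays the role of its predator, while simultaneously acting as a predator to the others. Below is a minimal sketch of that idea under tabular Q-learning on a grid world; the function names, the weighting factor `beta`, the epsilon-greedy exploration rate, and the grid setup are all illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def predator_prey_reward(agent_pos, other_positions, base_reward, beta=0.1):
    """Add a shaping term that grows with the distance to the nearest
    other agent (the 'predator'); each agent is both predator and prey.
    beta is an assumed weighting, not a value from the paper."""
    nearest = min(np.linalg.norm(np.subtract(agent_pos, p)) for p in other_positions)
    return base_reward + beta * nearest

def q_update(Q, s, a, r, s_next, alpha=0.5, gamma=0.9):
    """Standard tabular Q-learning update."""
    Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])

def step(pos, action):
    """Move on a 5x5 grid, clamping at the borders (actions: N, E, S, W)."""
    dr, dc = [(-1, 0), (0, 1), (1, 0), (0, -1)][action]
    return (min(max(pos[0] + dr, 0), 4), min(max(pos[1] + dc, 0), 4))

rng = np.random.default_rng(0)
Q = np.zeros((25, 4))          # one row per grid cell, one column per move
covered = set()                # cells visited by any agent so far
agents = [(0, 0), (4, 4)]      # a two-robot swarm starting in opposite corners

for t in range(200):           # a short synchronous training run
    for i, pos in enumerate(agents):
        s = pos[0] * 5 + pos[1]
        # epsilon-greedy action selection (epsilon = 0.1, assumed)
        a = int(rng.integers(4)) if rng.random() < 0.1 else int(Q[s].argmax())
        nxt = step(pos, a)
        base = 1.0 if nxt not in covered else -0.1   # reward uncovered cells
        others = [p for j, p in enumerate(agents) if j != i]
        r = predator_prey_reward(nxt, others, base)
        q_update(Q, s, a, r, nxt[0] * 5 + nxt[1])
        covered.add(nxt)
        agents[i] = nxt

print(f"cells covered after training: {len(covered)} / 25")
```

Here the shaping term simply pushes agents apart, which is one plausible way a predator-prey reward could discourage overlapping coverage; the paper's actual reward design may differ.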

Published in: 2024 Annual Conference Towards Autonomous Robotic Systems (TAROS)
ISSN: 0302-9743
Published: Springer, 18 July 2024
Online Access: https://link.springer.com/conference/taros

URI: https://cronfa.swan.ac.uk/Record/cronfa66908