Journal article

A Work Efficient Parallel Algorithm for Exact Euclidean Distance Transform / Mark Jones

IEEE Transactions on Image Processing, Volume: 28, Issue: 11, Pages: 5322 - 5335

Swansea University Author: Mark Jones

  • 2019_ParallelEDT.pdf

    PDF | Accepted Manuscript

    Download (2.96MB)
  • Author's Original under embargo until: 19th April 2020

Abstract

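A fully parallelized, work-time optimal algorithm is presented for computing the exact Euclidean Distance Transform (EDT) of a 2D binary image of size n × n. Unlike existing PRAM and other algorithms, this algorithm is suitable for implementation on modern SIMD architectures such as GPUs. Because the 1D EDT is a fundamental operation of the 2D EDT, it is parallelized first. Specifically, the GPU algorithm for the 1D EDT, which uses CUDA binary functions such as ballot(), ffs(), clz() and shfl(), runs in O(log_32 n) time and performs O(n) work. The fully parallelized, work-time optimal 2D EDT algorithm is then built on the 1D EDT and consists of three steps. Step 1 runs in O(log_32 n) time and performs O(N) total work on the GPU, where N = n^2. Step 2 performs O(N) total work and has an expected time complexity of O(log n) on the GPU. Step 3 runs in O(log_32 n) time and performs O(N) total work on the GPU. As far as we know, this is the first fully parallelized and realized work-time optimal algorithm for GPUs. Experimental results show that this algorithm outperforms prior state-of-the-art GPU algorithms.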
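
The full abstract in this record names the CUDA primitives ballot(), ffs() and clz() as building blocks of the 1D EDT. The sketch below is not the paper's algorithm; it is a sequential Python emulation, under stated assumptions, of how such bit operations can yield the distance to the nearest feature pixel within one 32-pixel segment. The function names, the lane-to-bit mapping, the single-segment restriction, and the choice of set bits as feature pixels are all assumptions made for illustration.

```python
WARP = 32              # assumed segment width (one 32-bit mask)
MASK32 = 0xFFFFFFFF


def ballot(bits):
    """Pack up to 32 truthy/falsy pixels into a mask; lane i maps to bit i."""
    mask = 0
    for lane, b in enumerate(bits):
        if b:
            mask |= 1 << lane
    return mask


def ffs(x):
    """1-based position of the least significant set bit, 0 if x == 0."""
    return (x & -x).bit_length()


def clz(x):
    """Number of leading zero bits in a 32-bit word."""
    return 32 - x.bit_length()


def edt_1d_segment(bits):
    """Distance from each pixel to the nearest set pixel in a segment of
    at most 32 pixels. Returns None where the segment has no set pixel."""
    if len(bits) > WARP:
        raise ValueError("this sketch handles a single 32-pixel segment only")
    mask = ballot(bits)
    out = []
    for lane in range(len(bits)):
        # Pixels at or to the right of this lane, with this lane at bit 0.
        right = mask >> lane
        # Pixels at or to the left of this lane, with this lane at bit 31.
        left = (mask << (31 - lane)) & MASK32
        candidates = []
        if right:
            candidates.append(ffs(right) - 1)
        if left:
            candidates.append(clz(left))
        out.append(min(candidates) if candidates else None)
    return out


def edt_1d_bruteforce(bits):
    """Reference implementation used only to check the sketch."""
    feats = [i for i, b in enumerate(bits) if b]
    return [min(abs(i - f) for f in feats) if feats else None
            for i in range(len(bits))]
```

On a GPU each lane would evaluate its own two shifts in parallel rather than in a loop; a full implementation would also have to combine results across segments, which this sketch does not attempt.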

Published in: IEEE Transactions on Image Processing
ISSN: 1057-7149 1941-0042
Published: 2019

URI: https://cronfa.swan.ac.uk/Record/cronfa50104
first_indexed 2019-05-09T20:00:51Z
last_indexed 2019-09-01T20:42:41Z
id cronfa50104
recordtype SURis
fullrecord <?xml version="1.0"?><rfc1807><datestamp>2019-09-01T18:36:00.2092860</datestamp><bib-version>v2</bib-version><id>50104</id><entry>2019-04-29</entry><title>A Work Efficient Parallel Algorithm for Exact Euclidean Distance Transform</title><swanseaauthors><author><sid>2e1030b6e14fc9debd5d5ae7cc335562</sid><ORCID>0000-0001-8991-1190</ORCID><firstname>Mark</firstname><surname>Jones</surname><name>Mark Jones</name><active>true</active><ethesisStudent>false</ethesisStudent></author></swanseaauthors><date>2019-04-29</date><deptcode>SCS</deptcode><abstract>A fully-parallelized work-time optimal algorithm is presented for computing the exact Euclidean Distance Transform (EDT) of a 2D binary image with the size of n x n. Unlike existing PRAM and other algorithms, this algorithm is suitable for implementation on modern SIMD architectures such as GPUs. As a fundamental operation of 2D EDT, 1D EDT is efficiently parallelized first. Specifically, the GPU algorithm for the 1D EDT, which uses CUDA binary functions such as ballot(), ffs(), clz() and shfl(), runs in O(log_32n) time and performs O(n) work. Using the 1D EDT as a fundamental operation, the fully parallelized work-time optimal 2D EDT algorithm is designed. This algorithm consists of three steps. Step 1 of the algorithm runs in O(log_32n) time and performs O(N) (N=n^2) of total work on GPU. Step 2 performs O(N) of total work and has an expected time complexity of O(logn) on GPU. Step 3 runs in O(log_32n) time and performs O(N) of total work on GPU. As far as we know, this algorithm is the first fully-parallelized and realized work-time optimal algorithm for GPUs. 
Experimental results show that this algorithm outperforms prior state-of-the-art GPU algorithms.</abstract><type>Journal Article</type><journal>IEEE Transactions on Image Processing</journal><volume>28</volume><journalNumber>11</journalNumber><paginationStart>5322</paginationStart><paginationEnd>5335</paginationEnd><publisher/><issnPrint>1057-7149</issnPrint><issnElectronic>1941-0042</issnElectronic><keywords/><publishedDay>20</publishedDay><publishedMonth>5</publishedMonth><publishedYear>2019</publishedYear><publishedDate>2019-05-20</publishedDate><doi>10.1109/TIP.2019.2916741</doi><url/><notes/><college>COLLEGE NANME</college><department>Computer Science</department><CollegeCode>COLLEGE CODE</CollegeCode><DepartmentCode>SCS</DepartmentCode><institution>Swansea University</institution><lastEdited>2019-09-01T18:36:00.2092860</lastEdited><Created>2019-04-29T10:01:47.7060198</Created><path><level id="1">College of Science</level><level id="2">Computer Science</level></path><authors><author><firstname>Mark</firstname><surname>Jones</surname><orcid>0000-0001-8991-1190</orcid><order>1</order></author></authors><documents><document><filename>0050104-07052019142411.pdf</filename><originalFilename>2019_ParallelEDT.pdf</originalFilename><uploaded>2019-05-07T14:24:11.1300000</uploaded><type>Output</type><contentLength>3078717</contentLength><contentType>application/pdf</contentType><version>Accepted Manuscript</version><cronfaStatus>true</cronfaStatus><action/><embargoDate>2019-06-20T00:00:00.0000000</embargoDate><copyrightCorrect>true</copyrightCorrect><language>eng</language></document><document><filename>Under embargo</filename><originalFilename>Under embargo</originalFilename><uploaded>2019-05-03T12:19:28.9970000</uploaded><type>Output</type><contentLength>21649977</contentLength><contentType>application/pdf</contentType><version>Author's 
Original</version><cronfaStatus>true</cronfaStatus><action/><embargoDate>2020-04-19T00:00:00.0000000</embargoDate><copyrightCorrect>true</copyrightCorrect><language>eng</language></document></documents></rfc1807>
title A Work Efficient Parallel Algorithm for Exact Euclidean Distance Transform
author_id_str_mv 2e1030b6e14fc9debd5d5ae7cc335562
author_id_fullname_str_mv 2e1030b6e14fc9debd5d5ae7cc335562_***_Mark, Jones
author Mark Jones
format Journal article
container_title IEEE Transactions on Image Processing
container_volume 28
container_issue 11
container_start_page 5322
publishDate 2019
institution Swansea University
issn 1057-7149 1941-0042
doi_str_mv 10.1109/TIP.2019.2916741
college_str College of Science
hierarchy_top_id collegeofscience
hierarchy_top_title College of Science
hierarchy_parent_id collegeofscience
hierarchy_parent_title College of Science
department_str Computer Science
document_store_str 1
active_str 0
published_date 2019-05-20T04:10:13Z