Journal article

A work efficient parallel algorithm for exact Euclidean Distance Transform / Manduhu; Mark W. Jones

IEEE Transactions on Image Processing, Pages: 1 - 1

Swansea University Author: Jones, Mark

  • 2019_ParallelEDT.pdf

    PDF | Accepted Manuscript

    Download (2.96MB)
  • Author's Original under embargo until: 19th April 2020

Abstract

A fully parallelized, work-time optimal algorithm is presented for computing the exact Euclidean Distance Transform (EDT) of a 2D binary image of size n x n. Unlike existing PRAM and other algorithms, this algorithm is suitable for implementation on modern SIMD architectures such as GPUs. As...

Published in: IEEE Transactions on Image Processing
ISSN: 1057-7149 1941-0042
Published: 2019

URI: https://cronfa.swan.ac.uk/Record/cronfa50104
first_indexed 2019-05-09T20:00:51Z
last_indexed 2019-07-18T21:34:26Z
id cronfa50104
recordtype SURis
fullrecord <?xml version="1.0"?><rfc1807><datestamp>2019-07-18T14:52:42Z</datestamp><bib-version>v2</bib-version><id>50104</id><entry>2019-04-29</entry><title>A work efficient parallel algorithm for exact Euclidean Distance Transform</title><alternativeTitle></alternativeTitle><author>Mark Jones</author><firstname>Mark</firstname><surname>Jones</surname><active>true</active><ORCID>0000-0001-8991-1190</ORCID><ethesisStudent>false</ethesisStudent><sid>2e1030b6e14fc9debd5d5ae7cc335562</sid><email>dda0c29127c698255a4c2b822dd94125</email><emailaddr>uiPdnV+XNibOpUxFjI3lXQgr5y2nBRz3haj4DmVVDsQ=</emailaddr><date>2019-04-29</date><deptcode>SCS</deptcode><abstract>A fully-parallelized work-time optimal algorithm is presented for computing the exact Euclidean Distance Transform (EDT) of a 2D binary image with the size of n x n. Unlike existing PRAM and other algorithms, this algorithm is suitable for implementation on modern SIMD architectures such as GPUs. As a fundamental operation of 2D EDT, 1D EDT is efficiently parallelized first. Specifically, the GPU algorithm for the 1D EDT, which uses CUDA binary functions such as ballot(), ffs(), clz() and shfl(), runs in O(log_32n) time and performs O(n) work. Using the 1D EDT as a fundamental operation, the fully parallelized work-time optimal 2D EDT algorithm is designed. This algorithm consists of three steps. Step 1 of the algorithm runs in O(log_32n) time and performs O(N) (N=n^2) of total work on GPU. Step 2 performs O(N) of total work and has an expected time complexity of O(logn) on GPU. Step 3 runs in O(log_32n) time and performs O(N) of total work on GPU. As far as we know, this algorithm is the first fully-parallelized and realized work-time optimal algorithm for GPUs. 
Experimental results show that this algorithm outperforms prior state-of-the-art GPU algorithms.</abstract><type>Journal article</type><journal>IEEE Transactions on Image Processing</journal><volume/><journalNumber/><paginationStart>1</paginationStart><paginationEnd>1</paginationEnd><publisher></publisher><placeOfPublication/><isbnPrint/><isbnElectronic/><issnPrint>1057-7149</issnPrint><issnElectronic>1941-0042</issnElectronic><keywords></keywords><publishedDay>0</publishedDay><publishedMonth>0</publishedMonth><publishedYear>2019</publishedYear><publishedDate>2019-01-01</publishedDate><doi>10.1109/TIP.2019.2916741</doi><url></url><notes></notes><college>College of Science</college><department>Computer Science</department><CollegeCode>CSCI</CollegeCode><DepartmentCode>SCS</DepartmentCode><institution/><researchGroup>Visual Computing</researchGroup><supervisor/><sponsorsfunders/><grantnumber/><degreelevel/><degreename>None</degreename><lastEdited>2019-07-18T14:52:42Z</lastEdited><Created>2019-04-29T10:01:47Z</Created><path><level id="1">College of Science</level><level id="2">Computer Science</level></path><authors><author><firstname></firstname><surname>Manduhu</surname><orcid/><order>1</order></author><author><firstname>Mark W.</firstname><surname>Jones</surname><orcid/><order>2</order></author></authors><documents><document><filename>Under embargo</filename><originalFilename>Under embargo</originalFilename><uploaded>2019-05-03T12:19:28Z</uploaded><type>Output</type><contentLength>21649977</contentLength><contentType>application/pdf</contentType><version>AO</version><cronfaStatus>true</cronfaStatus><action>Updated 
Copyright</action><actionDate>18/07/2019</actionDate><embargoDate>2020-04-19T00:00:00</embargoDate><documentNotes/><copyrightCorrect>true</copyrightCorrect><language>eng</language></document><document><filename>0050104-07052019142411.pdf</filename><originalFilename>2019_ParallelEDT.pdf</originalFilename><uploaded>2019-05-07T14:24:11Z</uploaded><type>Output</type><contentLength>3078717</contentLength><contentType>application/pdf</contentType><version>AM</version><cronfaStatus>true</cronfaStatus><action>Updated Embargo</action><actionDate>21/06/2019</actionDate><embargoDate>2019-06-20T00:00:00</embargoDate><documentNotes/><copyrightCorrect>true</copyrightCorrect><language>eng</language></document></documents></rfc1807>
title A work efficient parallel algorithm for exact Euclidean Distance Transform
author_id_str_mv 2e1030b6e14fc9debd5d5ae7cc335562
author_id_fullname_str_mv 2e1030b6e14fc9debd5d5ae7cc335562_***_Jones, Mark
author Jones, Mark
author2 Manduhu
Mark W. Jones
format Journal article
container_title IEEE Transactions on Image Processing
container_start_page 1
publishDate 2019
institution Swansea University
issn 1057-7149
1941-0042
doi_str_mv 10.1109/TIP.2019.2916741
college_str College of Science
hierarchytype
hierarchy_top_id collegeofscience
hierarchy_top_title College of Science
hierarchy_parent_id collegeofscience
hierarchy_parent_title College of Science
department_str Computer Science, College of Science
document_store_str 1
active_str 1
researchgroup_str Visual Computing
description A fully parallelized, work-time optimal algorithm is presented for computing the exact Euclidean Distance Transform (EDT) of a 2D binary image of size n x n. Unlike existing PRAM and other algorithms, this algorithm is suitable for implementation on modern SIMD architectures such as GPUs. The 1D EDT, a fundamental operation of the 2D EDT, is efficiently parallelized first. Specifically, the GPU algorithm for the 1D EDT, which uses CUDA binary functions such as ballot(), ffs(), clz() and shfl(), runs in O(log_32 n) time and performs O(n) work. Using the 1D EDT as a building block, the fully parallelized, work-time optimal 2D EDT algorithm is then designed. It consists of three steps. Step 1 runs in O(log_32 n) time and performs O(N) total work on the GPU, where N = n^2. Step 2 performs O(N) total work and has an expected time complexity of O(log n) on the GPU. Step 3 runs in O(log_32 n) time and performs O(N) total work on the GPU. As far as we know, this is the first fully parallelized and realized work-time optimal algorithm for GPUs. Experimental results show that it outperforms prior state-of-the-art GPU algorithms.
published_date 2019-01-01T22:25:08Z
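
The description above mentions ballot(), ffs() and clz() as building blocks of the 1D EDT. The sketch below is not the authors' algorithm; it is a minimal Python emulation, under stated assumptions, of how a 32-bit vote mask and two bit scans can give every pixel in one 32-pixel segment its distance to the nearest foreground pixel. The helper names (ballot, ffs, clz, edt_1d_segment) are illustrative stand-ins for the CUDA intrinsics, and the abstract does not say how segments are combined, so only a single segment is covered.

```python
WARP = 32  # assumed segment width: one bit per lane of a 32-lane warp


def ballot(flags):
    """Emulate a warp vote: bit i of the result is set when flags[i] is truthy."""
    mask = 0
    for lane, flag in enumerate(flags):
        if flag:
            mask |= 1 << lane
    return mask


def ffs(x):
    """1-based position of the least significant set bit, or 0 when x == 0."""
    return (x & -x).bit_length()


def clz(x):
    """Number of leading zero bits when x is viewed as a 32-bit word."""
    return WARP - x.bit_length()


def edt_1d_segment(flags):
    """Distance from each position to the nearest foreground position.

    flags holds at most 32 binary pixels. The result holds None for every
    position when the segment has no foreground pixel at all.
    """
    if len(flags) > WARP:
        raise ValueError("this sketch handles a single 32-pixel segment")
    mask = ballot(flags)
    distances = []
    for lane in range(len(flags)):
        below = mask & ((1 << (lane + 1)) - 1)  # foreground bits at positions <= lane
        above = mask >> lane                    # foreground bits at positions >= lane
        best = None
        if below:
            left = WARP - 1 - clz(below)        # highest set bit at or below lane
            best = lane - left
        if above:
            right = lane + ffs(above) - 1       # lowest set bit at or above lane
            gap = right - lane
            best = gap if best is None else min(best, gap)
        distances.append(best)
    return distances


if __name__ == "__main__":
    print(edt_1d_segment([0, 0, 1, 0, 0, 0, 1, 0]))
```

On a GPU the loop over lanes would disappear, since each lane would evaluate its own pair of bit scans on the shared vote mask; the emulation keeps the loop only so that it runs as ordinary Python.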