Guided latent diffusion for universal medical image segmentation

Salsi, Mattia; Wang, Yunying; Hu, Chen; Hu, Yueyue; Ren, Hanchi; Deng, Jingjing; Xie, Xianghua

doi:10.1117/12.3065214

Conference Paper/Proceeding/Abstract

International Conference on AI-Generated Content (AIGC 2024), Volume: 13649, Start page: 7

Swansea University Authors: Chen Hu, Hanchi Ren, Xianghua Xie

PDF | Accepted Manuscript

Copyright 2025. Society of Photo‑Optical Instrumentation Engineers (SPIE). One print or electronic copy may be made for personal use only. Systematic reproduction and distribution, duplication of any material in this publication for a fee or for commercial purposes, and modification of the contents of the publication are prohibited.

DOI (Published version): 10.1117/12.3065214

Abstract

Deep learning-based medical segmentation still presents a great challenge due to the lack of large-scale datasets in the medical domain. The existing publicly available datasets vary significantly in terms of imaging modalities and target anatomies. This paper presents a novel guided latent diffusio...

Published in:	International Conference on AI-Generated Content (AIGC 2024)
ISBN:	9781510692114 9781510692121
ISSN:	0277-786X 1996-756X
Published:	SPIE 2025
URI:	https://cronfa.swan.ac.uk/Record/cronfa68384

first_indexed	2024-11-29T13:46:46Z
last_indexed	2025-07-26T01:39:57Z
id	cronfa68384
recordtype	SURis
fullrecord	<?xml version="1.0"?><rfc1807><datestamp>2025-07-24T13:02:21.5023347</datestamp><bib-version>v2</bib-version><id>68384</id><entry>2024-11-29</entry><title>Guided latent diffusion for universal medical image segmentation</title><swanseaauthors><author><sid>55d3ba5f8378c2e3439d7e3962aee726</sid><firstname>Chen</firstname><surname>Hu</surname><name>Chen Hu</name><active>true</active><ethesisStudent>false</ethesisStudent></author><author><sid>9e043b899a2b786672a28ed4f864ffcc</sid><firstname>Hanchi</firstname><surname>Ren</surname><name>Hanchi Ren</name><active>true</active><ethesisStudent>false</ethesisStudent></author><author><sid>b334d40963c7a2f435f06d2c26c74e11</sid><ORCID>0000-0002-2701-8660</ORCID><firstname>Xianghua</firstname><surname>Xie</surname><name>Xianghua Xie</name><active>true</active><ethesisStudent>false</ethesisStudent></author></swanseaauthors><date>2024-11-29</date><deptcode>MACS</deptcode><abstract>Deep learning-based medical segmentation still presents a great challenge due to the lack of large-scale datasets in the medical domain. The existing publicly available datasets vary significantly in terms of imaging modalities and target anatomies. This paper presents a novel guided latent diffusion model for universal medical segmentation, capable of segmenting diverse anatomical structures using a single and unified architecture. Given a Contrastive Language-Image Pretraining (CLIP) embedding specifying the target anatomical structure, the proposed model leverages a collection of datasets covering diverse structures and can segment any anatomical target present in the aggregated data. By performing diffusion fully in latent space, we achieve comparable results to pixel-space diffusion with significantly lower computational cost. The proposed method demonstrates competitive performance against existing deep learning-based discriminative approaches on several benchmarks. 
Furthermore, it shows strong generalization capabilities on unseen datasets.</abstract><type>Conference Paper/Proceeding/Abstract</type><journal>International Conference on AI-Generated Content (AIGC 2024)</journal><volume>13649</volume><journalNumber/><paginationStart>7</paginationStart><paginationEnd/><publisher>SPIE</publisher><placeOfPublication/><isbnPrint>9781510692114</isbnPrint><isbnElectronic>9781510692121</isbnElectronic><issnPrint>0277-786X</issnPrint><issnElectronic>1996-756X</issnElectronic><keywords>Image segmentation; Data modeling; Diffusion; Performance modeling; Education and training; Anatomy; Visual process modeling; 3D modeling; Medical imaging; Denoising</keywords><publishedDay>7</publishedDay><publishedMonth>7</publishedMonth><publishedYear>2025</publishedYear><publishedDate>2025-07-07</publishedDate><doi>10.1117/12.3065214</doi><url/><notes/><college>COLLEGE NANME</college><department>Mathematics and Computer Science School</department><CollegeCode>COLLEGE CODE</CollegeCode><DepartmentCode>MACS</DepartmentCode><institution>Swansea University</institution><apcterm>Not Required</apcterm><funders/><projectreference/><lastEdited>2025-07-24T13:02:21.5023347</lastEdited><Created>2024-11-29T11:08:49.4467788</Created><path><level id="1">Faculty of Science and Engineering</level><level id="2">School of Mathematics and Computer Science - Computer 
Science</level></path><authors><author><firstname>Mattia</firstname><surname>Salsi</surname><order>1</order></author><author><firstname>Yunying</firstname><surname>Wang</surname><order>2</order></author><author><firstname>Chen</firstname><surname>Hu</surname><order>3</order></author><author><firstname>Yueyue</firstname><surname>Hu</surname><order>4</order></author><author><firstname>Hanchi</firstname><surname>Ren</surname><order>5</order></author><author><firstname>Jingjing</firstname><surname>Deng</surname><order>6</order></author><author><firstname>Xianghua</firstname><surname>Xie</surname><orcid>0000-0002-2701-8660</orcid><order>7</order></author></authors><documents><document><filename>68384__32999__96f9e975c5f640d1a5ee3be1ec3eda8d.pdf</filename><originalFilename>Diffusion_Segmentation.pdf</originalFilename><uploaded>2024-11-29T11:13:45.7541122</uploaded><type>Output</type><contentLength>1417819</contentLength><contentType>application/pdf</contentType><version>Accepted Manuscript</version><cronfaStatus>true</cronfaStatus><documentNotes>Copyright 2025. Society of Photo‑Optical Instrumentation Engineers (SPIE). One print or electronic copy may be made for personal use only. Systematic reproduction and distribution, duplication of any material in this publication for a fee or for commercial purposes, and modification of the contents of the publication are prohibited.</documentNotes><copyrightCorrect>true</copyrightCorrect><language>eng</language><licence>https://creativecommons.org/licenses/by/4.0/deed.en</licence></document></documents><OutputDurs/></rfc1807>
title	Guided latent diffusion for universal medical image segmentation
author_id_str_mv	55d3ba5f8378c2e3439d7e3962aee726 9e043b899a2b786672a28ed4f864ffcc b334d40963c7a2f435f06d2c26c74e11
author_id_fullname_str_mv	55d3ba5f8378c2e3439d7e3962aee726_*_Chen Hu 9e043b899a2b786672a28ed4f864ffcc__Hanchi Ren b334d40963c7a2f435f06d2c26c74e11_**_Xianghua Xie
author	Chen Hu Hanchi Ren Xianghua Xie
author2	Mattia Salsi Yunying Wang Chen Hu Yueyue Hu Hanchi Ren Jingjing Deng Xianghua Xie
format	Conference Paper/Proceeding/Abstract
container_title	International Conference on AI-Generated Content (AIGC 2024)
container_volume	13649
container_start_page	7
publishDate	2025
institution	Swansea University
isbn	9781510692114 9781510692121
issn	0277-786X 1996-756X
doi_str_mv	10.1117/12.3065214
publisher	SPIE
college_str	Faculty of Science and Engineering
hierarchytype
hierarchy_top_id	facultyofscienceandengineering
hierarchy_top_title	Faculty of Science and Engineering
hierarchy_parent_id	facultyofscienceandengineering
hierarchy_parent_title	Faculty of Science and Engineering
department_str	School of Mathematics and Computer Science - Computer Science{{{_:::_}}}Faculty of Science and Engineering{{{_:::_}}}School of Mathematics and Computer Science - Computer Science
document_store_str	1
active_str	0
description	Deep learning-based medical segmentation still presents a great challenge due to the lack of large-scale datasets in the medical domain. The existing publicly available datasets vary significantly in terms of imaging modalities and target anatomies. This paper presents a novel guided latent diffusion model for universal medical segmentation, capable of segmenting diverse anatomical structures using a single and unified architecture. Given a Contrastive Language-Image Pretraining (CLIP) embedding specifying the target anatomical structure, the proposed model leverages a collection of datasets covering diverse structures and can segment any anatomical target present in the aggregated data. By performing diffusion fully in latent space, we achieve comparable results to pixel-space diffusion with significantly lower computational cost. The proposed method demonstrates competitive performance against existing deep learning-based discriminative approaches on several benchmarks. Furthermore, it shows strong generalization capabilities on unseen datasets.
published_date	2025-07-07T05:21:49Z
_version_	1858617013210972160
score	11.098272

Similar Items