Conference Paper/Proceeding/Abstract

Guided latent diffusion for universal medical image segmentation

Mattia Salsi, Yunying Wang, Chen Hu, Yueyue Hu, Hanchi Ren, Jingjing Deng, Xianghua Xie

International Conference on AI-Generated Content (AIGC 2024), Volume: 13649, Start page: 7

Swansea University Authors: Chen Hu, Hanchi Ren, Xianghua Xie

  • Diffusion_Segmentation.pdf

    PDF | Accepted Manuscript

    Copyright 2025. Society of Photo‑Optical Instrumentation Engineers (SPIE). One print or electronic copy may be made for personal use only. Systematic reproduction and distribution, duplication of any material in this publication for a fee or for commercial purposes, and modification of the contents of the publication are prohibited.


DOI (Published version): 10.1117/12.3065214

Published in: International Conference on AI-Generated Content (AIGC 2024)
ISBN: 9781510692114 9781510692121
ISSN: 0277-786X 1996-756X
Published: SPIE 2025

URI: https://cronfa.swan.ac.uk/Record/cronfa68384
Abstract: Deep learning-based medical segmentation remains a great challenge due to the lack of large-scale datasets in the medical domain. The existing publicly available datasets vary significantly in terms of imaging modalities and target anatomies. This paper presents a novel guided latent diffusion model for universal medical segmentation, capable of segmenting diverse anatomical structures using a single, unified architecture. Given a Contrastive Language-Image Pretraining (CLIP) embedding specifying the target anatomical structure, the proposed model leverages a collection of datasets covering diverse structures and can segment any anatomical target present in the aggregated data. By performing diffusion fully in latent space, we achieve results comparable to pixel-space diffusion at significantly lower computational cost. The proposed method demonstrates competitive performance against existing deep learning-based discriminative approaches on several benchmarks. Furthermore, it shows strong generalization capabilities on unseen datasets.
Keywords: Image segmentation; Data modeling; Diffusion; Performance modeling; Education and training; Anatomy; Visual process modeling; 3D modeling; Medical imaging; Denoising
College: Faculty of Science and Engineering
Start Page: 7