Link to paper

The full paper is available here.

You can also find the paper on PapersWithCode here.

Abstract

Human organs change due to short-term and long-term factors
Medical image generation tasks rely on single images, ignoring sequential dependency
Sequence-aware deep generative models are underexplored in medical imaging
Sequence-aware diffusion model (SADM) proposed to generate longitudinal medical images
SADM uses sequence-aware transformer to learn longitudinal dependency with missing data

Paper Content

Introduction

Generative models used in medical domain due to hardware and data availability
Two tasks: generation of longitudinal brain image and multi-frame cardiac image
Generative adversarial networks and diffusion models used
Sequence-aware deep generative models to learn temporal dependency
Proposed sequence-aware diffusion model (SADM) to address issues of longitudinal medical data
SADM works with single image input, longitudinal data with missing frames, and high-dimensional images
State-of-the-art results in longitudinal image generation and missing data imputation

Preliminary

Diffusion models

Diffusion models consist of a forward process and a reverse process.
The forward process adds noise to data to create a noisy version.
The reverse process predicts and subtracts the noise in the reverse direction.
The reverse process is parameterized by a generative model.

Classifier-free guidance

Two branches of conditioning methods for diffusion models: classifier guided and classifier-free guidance
Classifier-free guidance used for conditioning the diffusion model
Guidance strength controls tradeoff between sample quality and diversity
Sequence-aware diffusion model (SADM) proposed for longitudinal medical image generation
Transformer-based attention module used to generate conditioning signals for diffusion model
Problem setting: X is a longitudinal 3D medical image with temporal length L
Token embedding: 4D convolution with l = 1
Temporal encoder: self-attention along temporal dimensions
Spatial decoder: self-attention to spatial dimensions between all tokens in same temporal index

Conditional diffusion model

Proposed SADM follows formulation of classifier-free diffusion model
Uses sequence-aware conditioning signal
Autoregressive sampling scheme captures long-distance temporal dependency
Training pipeline defined in Algorithm 1
Autoregressive sampling scheme imputes missing images

Experiments

Proposed SADM effective for medical image generation
Compared to GAN-based and diffusion-based baselines
Ablation study of model components with various input sequence settings

Dataset and implementation

Used the multi-frame cardiac MRI curated by ACDC for a common task of synthesizing the final frame of a cardiac cycle
Resized the dataset to X ∈ R 12×128×128×32 and Min-Max normalized it subject-wise
Followed the classifier-free diffusion model architecture and hyperparameters and modified it into a 3D model
Trained the model for 3 million iterations, which took 150 GPU hours

Comparison with baseline methods

Two state-of-the-art baseline models used for comparison
Augmented baselines with intermediate frames and ES frames
SADM shows better depiction of blood pool and other areas
Ventricular regions synthesized by SADM are more crisp
Quantitative comparison of SSIM, PSNR, and normalized RMSE
SADM outperforms GAN-based method by 3-13% in each metric

Ablation study

Synthesis using full sequence and missing data settings show higher SSIM than single input setting.
Model is learning which frames of the input sequence are important in generating future frames.
Removing either the attention module or the diffusion model results in worse performance.

Conclusion

Diffusion models are computationally expensive
SADM uses an autoregressive sampling scheme
Alternative sampling methods are not suitable for medical domain
SADM uses a transformer-based attention module to learn temporal dependence
Tested SADM on longitudinal cardiac and brain MRI
First to explore temporal dependency in diffusion models for medical image generation
Examples of longitudinal medical images synthesized by SADM shown in Fig. 1
SADM tested with single image, missing data, and full sequence settings
In-house synthesis of longitudinal brain MRI done in two steps
GAN-based registration model used to register subject scans to age-specific templates
Ablation study of SADM components conducted

Link to paper#

Abstract#

Paper Content#

Introduction#

Preliminary#

Diffusion models#

Classifier-free guidance#

Conditional diffusion model#

Experiments#

Dataset and implementation#

Comparison with baseline methods#

Ablation study#

Conclusion#

Link to paper

Abstract

Paper Content

Introduction

Preliminary

Diffusion models

Classifier-free guidance

Conditional diffusion model

Experiments

Dataset and implementation

Comparison with baseline methods

Ablation study

Conclusion