Link to paper
The full paper is available here.
You can also find the paper on PapersWithCode here.
Abstract
- Human organs change due to short-term and long-term factors
- Medical image generation tasks rely on single images, ignoring sequential dependency
- Sequence-aware deep generative models are underexplored in medical imaging
- Sequence-aware diffusion model (SADM) proposed to generate longitudinal medical images
- SADM uses sequence-aware transformer to learn longitudinal dependency with missing data
Paper Content
Introduction
- Generative models used in medical domain due to hardware and data availability
- Two tasks: generation of longitudinal brain image and multi-frame cardiac image
- Generative adversarial networks and diffusion models used
- Sequence-aware deep generative models to learn temporal dependency
- Proposed sequence-aware diffusion model (SADM) to address issues of longitudinal medical data
- SADM works with single image input, longitudinal data with missing frames, and high-dimensional images
- State-of-the-art results in longitudinal image generation and missing data imputation
Preliminary
Diffusion models
- Diffusion models consist of a forward process and a reverse process.
- The forward process adds noise to data to create a noisy version.
- The reverse process predicts and subtracts the noise in the reverse direction.
- The reverse process is parameterized by a generative model.
Classifier-free guidance
- Two branches of conditioning methods for diffusion models: classifier guided and classifier-free guidance
- Classifier-free guidance used for conditioning the diffusion model
- Guidance strength controls tradeoff between sample quality and diversity
- Sequence-aware diffusion model (SADM) proposed for longitudinal medical image generation
- Transformer-based attention module used to generate conditioning signals for diffusion model
- Problem setting: X is a longitudinal 3D medical image with temporal length L
- Token embedding: 4D convolution with l = 1
- Temporal encoder: self-attention along temporal dimensions
- Spatial decoder: self-attention to spatial dimensions between all tokens in same temporal index
Conditional diffusion model
- Proposed SADM follows formulation of classifier-free diffusion model
- Uses sequence-aware conditioning signal
- Autoregressive sampling scheme captures long-distance temporal dependency
- Training pipeline defined in Algorithm 1
- Autoregressive sampling scheme imputes missing images
Experiments
- Proposed SADM effective for medical image generation
- Compared to GAN-based and diffusion-based baselines
- Ablation study of model components with various input sequence settings
Dataset and implementation
- Used the multi-frame cardiac MRI curated by ACDC for a common task of synthesizing the final frame of a cardiac cycle
- Resized the dataset to X โ R 12ร128ร128ร32 and Min-Max normalized it subject-wise
- Followed the classifier-free diffusion model architecture and hyperparameters and modified it into a 3D model
- Trained the model for 3 million iterations, which took 150 GPU hours
Comparison with baseline methods
- Two state-of-the-art baseline models used for comparison
- Augmented baselines with intermediate frames and ES frames
- SADM shows better depiction of blood pool and other areas
- Ventricular regions synthesized by SADM are more crisp
- Quantitative comparison of SSIM, PSNR, and normalized RMSE
- SADM outperforms GAN-based method by 3-13% in each metric
Ablation study
- Synthesis using full sequence and missing data settings show higher SSIM than single input setting.
- Model is learning which frames of the input sequence are important in generating future frames.
- Removing either the attention module or the diffusion model results in worse performance.
Conclusion
- Diffusion models are computationally expensive
- SADM uses an autoregressive sampling scheme
- Alternative sampling methods are not suitable for medical domain
- SADM uses a transformer-based attention module to learn temporal dependence
- Tested SADM on longitudinal cardiac and brain MRI
- First to explore temporal dependency in diffusion models for medical image generation
- Examples of longitudinal medical images synthesized by SADM shown in Fig. 1
- SADM tested with single image, missing data, and full sequence settings
- In-house synthesis of longitudinal brain MRI done in two steps
- GAN-based registration model used to register subject scans to age-specific templates
- Ablation study of SADM components conducted