Link to paper

The full paper is available here.

You can also find the paper on PapersWithCode here.

Abstract

Diffusion models generate realistic objects from complex data distributions.
Diffusion models are computationally inefficient.
Branched diffusion models offer improvements in efficient generation from multiple classes.
Branched diffusion models offer advantages such as ease of extension and interpretability.

Paper Content

Introduction

Diffusion models are popular for generating complex data distributions
Diffusion models have been successful in generating images, videos, graphs, and tabular data
Generating samples from diffusion models is computationally costly
Retraining diffusion models to add new classes is costly, which limits their use in continual learning
Branched diffusion models offer major improvements in computational efficiency for multi-class sample generation
Branched diffusion models can generate new data classes efficiently
Branched diffusion models offer interpretability into the diffusion and generation processes

Related work

Song et al. (2021) proposed a method of conditional generation by class label
Ho et al. (2021) proposed an alternative method of classifier-free conditional generation
Classifier-free method can be applied to both continuous- and discrete-time diffusion models
Several diverse methods have been proposed to improve training or sampling efficiency

Hierarchically branched diffusion models

Proposed hierarchically branched diffusion structure for multi-class conditional generation
Diffusion process shared between closely related classes at later time intervals
Branches constrained such that each class and time can be assigned to one branch
Branches form a rooted tree from t=T to t=0
Optimal branch definitions can be computed from data or prior domain knowledge
Implemented as one multi-task neural network
Training follows same procedure as standard linear diffusion model
Demonstrated on two datasets of different modalities
Visualized random sample of digits generated from branched diffusion model
Branched model generated letters that are realistic and true to training data
Conditional generation of a single class requires no more computation than sampling from an unconditional diffusion model
Label-guided diffusion models trained on same data
Branched diffusion models achieved similar generative performance compared to label-guided models
Branched diffusion models also able to perform in discrete-time diffusion settings
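
The branch constraints listed above can be sketched as a small data structure. This is a minimal illustration under stated assumptions, not the authors' implementation: the `Branch` tuple, the `find_branch` and `check_partition` helpers, the time horizon `T = 1.0`, and the branch times for digits 0, 4, and 9 are all hypothetical.

```python
from typing import NamedTuple


class Branch(NamedTuple):
    t_start: float       # branch covers diffusion times in [t_start, t_end)
    t_end: float
    classes: frozenset   # classes that share this branch


T = 1.0  # assumed time horizon

# Hypothetical branch definitions; the branch times are made up for illustration.
BRANCHES = [
    Branch(0.5, T, frozenset({0, 4, 9})),    # root: all classes share the noisiest interval
    Branch(0.35, 0.5, frozenset({4, 9})),    # 4 and 9 stay merged longer than 0
    Branch(0.0, 0.5, frozenset({0})),
    Branch(0.0, 0.35, frozenset({4})),
    Branch(0.0, 0.35, frozenset({9})),
]


def find_branch(branches, cls, t):
    """Return the index of the unique branch covering class `cls` at time `t`."""
    hits = [i for i, b in enumerate(branches)
            if cls in b.classes and b.t_start <= t < b.t_end]
    if len(hits) != 1:
        raise ValueError(f"class {cls} at t={t} maps to {len(hits)} branches")
    return hits[0]


def check_partition(branches, classes, horizon, steps=200):
    """Check on a time grid that every (class, time) pair has exactly one branch."""
    for cls in classes:
        for k in range(steps):
            find_branch(branches, cls, horizon * k / steps)
    return True
```

In a multi-task network, an index like the one returned by `find_branch` could select which output head handles a given class and time; that wiring is an assumption here, since the summary only states that the model is one multi-task neural network.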

Efficient sampling from branched diffusion models

Branched diffusion models offer improved computational efficiency when sampling from multiple classes.
Sampling from a diffusion model is computationally expensive due to its iterative nature.
Branched diffusion models share much of the diffusion process for multiple classes.
Branched models take the same amount of computation as a linear model to generate samples of a single class.
Branched models enjoy significant savings in computational efficiency when sampling multiple classes.
Speedup factor for multi-class sample generation depends on the branching structure between the classes.
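
The dependence of the speedup on the branching structure can be illustrated with a simple cost count. The sketch below assumes a fixed step size, so cost is proportional to the diffusion time covered, and assumes one sample (or one batch) is drawn per class with shared intermediates reused at each branch point. The function names and branch times are hypothetical.

```python
def linear_cost(num_classes, horizon):
    """Diffusion time covered when each class is sampled separately from t=T to t=0."""
    return num_classes * horizon


def branched_cost(branches):
    """Diffusion time covered when every branch is reverse diffused once
    and its intermediate is reused by all child branches."""
    return sum(t_end - t_start for t_start, t_end, _ in branches)


def single_class_cost(branches, cls):
    """Diffusion time covered along the path of one class from the root to t=0."""
    return sum(t_end - t_start for t_start, t_end, classes in branches if cls in classes)


# Hypothetical branches as (t_start, t_end, classes); the times are made up.
branches = [
    (0.5, 1.0, {0, 4, 9}),
    (0.35, 0.5, {4, 9}),
    (0.0, 0.5, {0}),
    (0.0, 0.35, {4}),
    (0.0, 0.35, {9}),
]

speedup = linear_cost(3, 1.0) / branched_cost(branches)
```

Under these made-up branch times the three classes cover about 1.85 time units instead of 3, while a single class still covers the full horizon. Later branch points (less sharing) push the speedup toward 1.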

Extending branched diffusion models to novel classes

Branched diffusion models have a hierarchical structure that allows new classes to be added easily.
A small branched diffusion model was trained on three MNIST classes (0, 4, and 9) and then a new digit class (7) was added.
The new branch was fine-tuned only on 7s, and the model was able to generate high-quality 7s without affecting the other digits.
A label-guided (linear) diffusion model was also trained on 0s, 4s, and 9s, and when a new digit class (7) was added, the model suffered from catastrophic forgetting.
Retraining the linear model on all data would take much longer than training the branched diffusion model, and the linear model still experienced inappropriate influence from the new task.
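
One way to picture the extension step is as an update to the branch definitions: the new class joins the existing shared branches above some branch point and receives one new leaf branch, which is the only branch that needs training. The sketch below is an assumption about how this could be organized, not the authors' code. The `add_class` helper, the choice of 9 as the sibling class, and the branch times are hypothetical.

```python
def add_class(branches, new_cls, sibling_cls, t_branch):
    """Return (updated_branches, index_of_branch_to_train).

    branches: list of (t_start, t_end, frozenset_of_classes).
    The new class joins every branch of `sibling_cls` that starts at or after
    `t_branch`, and gets its own new leaf branch on [0, t_branch).
    """
    if not any(sibling_cls in c and s == t_branch for s, _, c in branches):
        raise ValueError("t_branch must be an existing branch point of sibling_cls")
    updated = []
    for s, e, c in branches:
        if sibling_cls in c and s >= t_branch:
            c = c | {new_cls}   # shared branch now also serves the new class
        updated.append((s, e, c))
    updated.append((0.0, t_branch, frozenset({new_cls})))  # new leaf branch
    return updated, len(updated) - 1


# Hypothetical branch definitions for digits 0, 4, and 9.
old = [
    (0.5, 1.0, frozenset({0, 4, 9})),
    (0.35, 0.5, frozenset({4, 9})),
    (0.0, 0.5, frozenset({0})),
    (0.0, 0.35, frozenset({4})),
    (0.0, 0.35, frozenset({9})),
]

new, train_idx = add_class(old, new_cls=7, sibling_cls=9, t_branch=0.35)
```

Because the existing leaf branches for 0, 4, and 9 are left untouched in this sketch, fine-tuning only the branch at `train_idx` cannot change how those digits are generated below their branch points.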

Interpretability of branched diffusion models

Branched diffusion models are efficient and extendable.
Branched diffusion models can reveal insight into common features between classes.
Branched diffusion models can generate analogous objects between different classes.

Hybrids at branch points

Branch points act as minimal times of indistinguishability
Classes split off at branch points and start reverse diffusing
Two branches meet at a branch point when classes become noisy
Hybrid objects are generated from reverse diffusion
Hybrids show shared characteristics between classes
Hybrids act as a smooth transition between two endpoints

Transmutation between classes

Diffusion models can traverse the diffusion process both forward and in reverse.
Branched diffusion models can be used to start with an object from one class and generate the analogous object in a different class.
On the MNIST branched diffusion model, 4s and 9s were transmuted based on the slantedness of the digit.
On the tabular branched diffusion model, letters were transmuted between V and Y, and every feature's values before and after transmutation were positively correlated.
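
Transmutation as described above can be sketched as forward diffusing an object to a branch point and then reverse diffusing along the other class's branch. The toy below is an illustration under strong assumptions, not the paper's setup: each class is a 1-D Gaussian so its score is known in closed form (standing in for a trained network), the VP-SDE uses a constant rate `BETA`, and plain Euler-Maruyama replaces the predictor-corrector sampler. All names and values are hypothetical.

```python
import math
import random

BETA = 8.0  # assumed constant noise rate of a toy VP-SDE


def forward_diffuse(x0, t, rng):
    """Sample x_t given x_0 from the closed-form VP-SDE marginal."""
    a = math.exp(-0.5 * BETA * t)
    return a * x0 + math.sqrt(1.0 - a * a) * rng.gauss(0.0, 1.0)


def class_score(x, t, mu, sigma):
    """Exact score of a diffused 1-D Gaussian class N(mu, sigma^2) at time t."""
    a = math.exp(-0.5 * BETA * t)
    var = (sigma * a) ** 2 + 1.0 - a * a
    return -(x - mu * a) / var


def reverse_diffuse(x, t_from, mu, sigma, rng, steps=400):
    """Euler-Maruyama integration of the reverse-time SDE from t_from down to 0."""
    dt = t_from / steps
    for k in range(steps):
        t = t_from - k * dt
        drift = 0.5 * BETA * x + BETA * class_score(x, t, mu, sigma)
        x = x + drift * dt + math.sqrt(BETA * dt) * rng.gauss(0.0, 1.0)
    return x


def transmute(x0, t_branch, target_mu, target_sigma, rng):
    """Noise x0 up to the branch point, then denoise along the target class."""
    x_b = forward_diffuse(x0, t_branch, rng)
    return reverse_diffuse(x_b, t_branch, target_mu, target_sigma, rng)


rng = random.Random(0)
source = [rng.gauss(-1.0, 0.3) for _ in range(1000)]       # toy class A
outputs = [transmute(x, 0.5, 1.0, 0.3, rng) for x in source]  # toy class B
```

In this 1-D toy the branch time is late enough that almost nothing of the source object survives. In a real branched model the branch point is where the two classes first become indistinguishable, so features shared by both classes can persist through the round trip.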

Discussion

Branched diffusion models represent a tradeoff between training and sampling complexity
Branched models require more parameters and more training time than linear counterparts, but have significant savings in computational efficiency during sampling
Branched diffusion models require branch points to be times of indistinguishability between classes, which can be challenging for certain image datasets
Branched diffusion can easily accommodate other modalities such as tabular data, graphs, and text

Conclusion

Proposed a novel form of diffusion models which introduces branch points to encode hierarchical relationship between data classes
Branched diffusion models offer an alternative framework of conditional generation for discrete classes
Flexibly applied to many traditional diffusion-model paradigms
More efficient to sample multiple classes from a branched diffusion model than a linear one
Easily extendable to new classes through a short and efficient finetuning step
Can offer insights into interpretability
Reverse-diffusion intermediates at branch points are hybrids which encode shared or interpolated characteristics of multiple data classes
Can transmute objects from one class into the analogous object in another class
Trained models and performed analyses on a single Nvidia Quadro P6000
Used MNIST and tabular letter-recognition datasets
Used variance-preserving stochastic differential equation for continuous-time diffusion models
Used discrete-time Gaussian noising process for discrete-time diffusion model
Computed branch definitions using an algorithm
Used UNet and dense architectures for MNIST and tabular letters models respectively
Used group normalization after every layer
Used time and label embeddings
Trained with a batch size of 128 examples
Used learning rate of 0.001
Trained each model for a different number of epochs
Used predictor-corrector algorithm for continuous-time diffusion models
Used sampling algorithm for discrete-time diffusion model
Compared quality of samples generated from branched and linear diffusion models using Fréchet Inception Distance
Leveraged branching structure to generate samples
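
For reference, the variance-preserving SDE mentioned in the list above has the standard form below (Song et al., 2021). The noise schedule \beta(t) is left general because this summary does not state the schedule that was used.

```latex
% Forward variance-preserving SDE; \beta(t) is the noise schedule, w is a Wiener process
\mathrm{d}x = -\tfrac{1}{2}\,\beta(t)\,x\,\mathrm{d}t + \sqrt{\beta(t)}\,\mathrm{d}w

% Resulting perturbation kernel, which noises x_0 directly to time t
p(x_t \mid x_0) = \mathcal{N}\!\left(x_t;\; x_0\, e^{-\frac{1}{2}\int_0^t \beta(s)\,\mathrm{d}s},\;
  \left(1 - e^{-\int_0^t \beta(s)\,\mathrm{d}s}\right) I\right)
```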
