Link to paper

The full paper is available here.

You can also find the paper on PapersWithCode here.

Abstract

  • InDI is a new formulation for supervised image restoration that avoids the “regression to the mean” effect.
  • InDI gradually improves image quality in small steps, similar to generative denoising diffusion models.
  • InDI does not require knowledge of any analytic form of the degradation process.
  • InDI can be applied to virtually any image degradation, given paired training data.

Paper Content

Introduction

  • Recovering a high-quality image from a low-quality observation is a fundamental problem in computer vision and computational imaging.
  • Supervised approach is to infer the underlying image given a low-quality version of it.
  • Common approach is to minimize a pixel reconstruction error using the L1 or L2 loss.
  • Point-distortion metrics (e.g. PSNR) do not correlate well to human perception.
  • Recent research has focused on improving deep architectures and optimizing a variety of point-loss formulations.
  • Approach proposed in paper is to generate a sequence of intermediate restorations to avoid the regression-to-the-mean effect.

Background

  • Recently, much work has been done on imaging inverse problems using generative formulations
  • Generative adversarial formulations train restoration networks with an adversarial loss
  • Image priors can be used to solve inverse problems in an unsupervised fashion
  • Denoising Diffusion Probabilistic Models and Score-based models are two powerful classes of generative models
  • Conditional DDPM models generate plausible reconstructions given the low-quality input
  • Our formulation is straightforward to implement and train, and produces high-quality results
  • Image restoration is the process of generating a high-quality image from a degraded low-quality measurement.
  • Supervised image restoration can be done using Denoising Diffusion Probabilistic Models or Score-based models.
  • An alternative formulation is to use a conditional denoising diffusion model to generate samples from the posterior distribution.

Indi: our proposed formulation

  • Definition of continuous forward degradation process
  • Starts from clean sharp image at time t = 0 and degrades to blurry/noisy observation at time t = 1
  • Index t referred to as time-step
  • Recovery method starts with input degraded image (time t = 1)
  • Proposition 4.1 provides cornerstone of approach
  • Posterior mean at time s < t can be deduced from estimate at time t
  • Scheme to move from t to s = t − δ
  • Process starts from x1 = y, step δ < 1 controls “speed” of reverse process
  • Require p xt ( xt ) > 0 for iteration procedure to be well defined
  • Toy example given
  • Train family of regressors F θ (•; t) to reconstruct x from x t at given t
  • Iterative scheme converges to one of possible modes
  • Add small amount of noise to low-quality input to guarantee regularity requirements
  • Algorithm 1 given

Experiments

  • Framework is trained and evaluated on four image restoration tasks
  • Formulation is generative-based and can be used for image generation
  • Quality of proposed method is evaluated using distortion and perceptual metrics
  • Results show direct impact of number of steps on distortion-perception tradeoff
  • Model architecture is U-Net-like and trained on image crops using ADAM optimizer

Motion deblurring

  • Motion deblurring is a difficult task because there is no known degradation model.
  • The best current solution is to train regression models using paired data sharp, blurry frames.
  • A dataset of 3214 pairs of clean and blurry 1280 × 720 images is used for training.
  • Results show that the iterative image restoration produces images with more details than regression based solutions.

Single-image super-resolution

  • Evaluated iterative restoration methodology on single-image 4× super-resolution
  • Compared to other state-of-the art models
  • Proposed framework leads to upscaled images with more defined structure
  • Adversarial formulation produces slightly better fine grain details
  • Adding a small amount of noise to the input image leads to better results

Defocus deblurring

  • Defocus deblurring is the task of reducing blur caused by limited depth-of-field or misfocus.
  • A dataset of 1000 pairs of sharp and blurry images was used to train a model.
  • Increasing the number of inference steps improves the quality of the result.

Compression artifact removal

  • JPEG compression introduces blocking artifacts and lack of high-frequency details
  • Proposed method evaluated on task of removing strong JPEG compression artifacts
  • Training data generated from 1000 div2k high-quality images
  • Model evaluated on div2k validation set
  • More inference steps used, more details in restored images

Discussion

A generative framework

  • A natural question is whether the proposed approach is generative in the spirit of diffusion formulations.
  • A restoration model was trained to start from pure Gaussian noise and Figure 8 shows some generated samples.
  • The generated samples have a FID of 9.19, which is not state-of-the-art.
  • The two methods have different motivations/formulations and inference strategies.

Comparison of inference algorithms

Impact of distribution p(t)

  • Training with a bias towards t = 1 yields best results
  • Best distribution depends on model capacity and restoration task

Impact of adding noise on inverting deterministic degradations

  • JPEG compression is a non-linear and deterministic degradation.
  • Adding a small amount of noise can improve the results.

Comparison to a conditional denoising diffusion model

  • InDI is compared to a vanilla conditional DDPM
  • A vanilla conditional DDPM is trained using noise level as an additional input
  • The model architecture is the same as InDI but the auxiliary noise image is concatenated with the low-quality input
  • InDI produces comparable results with fewer steps than the vanilla DDPM

Conclusions and limitations

  • Novel formulation of image restoration circumvents regression-to-mean problem
  • Allows for restored images with superior realism and perceptual quality
  • Restoration task broken into many small, easier problems
  • Supervised formulation requires paired training data
  • Performance for out-of-distribution samples not guaranteed
  • Accumulation of errors can degrade performance
  • Future work to better characterize limiting points and develop robust formulations
  • Iterative inference algorithm produces high-quality restorations
  • Results on 4x SR div2k validation dataset