Abstract
- InDI is a new formulation for supervised image restoration that avoids the “regression to the mean” effect.
- InDI gradually improves image quality in small steps, similar to generative denoising diffusion models.
- InDI does not require knowledge of any analytic form of the degradation process.
- InDI can be applied to virtually any image degradation, given paired training data.
Paper Content
Introduction
- Recovering a high-quality image from a low-quality observation is a fundamental problem in computer vision and computational imaging.
- The supervised approach trains a model to infer the underlying image from a low-quality version of it.
- A common approach is to minimize a pixel reconstruction error using the L1 or L2 loss.
- Point-distortion metrics (e.g. PSNR) do not correlate well with human perception.
- Recent research has focused on improving deep architectures and optimizing a variety of point-loss formulations.
- The paper proposes generating a sequence of intermediate restorations to avoid the regression-to-the-mean effect.
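The regression-to-the-mean effect mentioned above can be illustrated with a toy calculation. This is a minimal sketch, not taken from the paper: it assumes a hypothetical one-pixel posterior in which the same degraded input is equally consistent with a clean value of -1.0 or +1.0, and the helper `mse` is an illustrative name.

```python
import random

def mse(estimate, targets):
    """Mean squared error of one constant estimate against many targets."""
    return sum((estimate - x) ** 2 for x in targets) / len(targets)

# Hypothetical toy posterior: the same degraded input is equally consistent
# with a clean pixel value of -1.0 or +1.0 (two sharp "modes").
rng = random.Random(0)
clean_samples = [rng.choice([-1.0, 1.0]) for _ in range(10_000)]

# The L2-optimal point estimate is the mean of the plausible clean values.
l2_optimal = sum(clean_samples) / len(clean_samples)

# Committing to one sharp mode costs more MSE than the in-between average.
mse_of_mean = mse(l2_optimal, clean_samples)
mse_of_mode = mse(1.0, clean_samples)
print(f"estimate={l2_optimal:+.3f} mse(mean)={mse_of_mean:.3f} mse(mode)={mse_of_mode:.3f}")
```

The L2-optimal estimate lands near 0, a value that never occurs among the clean samples. For images, this averaging over plausible answers is what shows up as blur.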
Background
- Recently, much work has been done on imaging inverse problems using generative formulations
- Generative adversarial formulations train restoration networks with an adversarial loss
- Image priors can be used to solve inverse problems in an unsupervised fashion
- Denoising Diffusion Probabilistic Models and Score-based models are two powerful classes of generative models
- Conditional DDPM models generate plausible reconstructions given the low-quality input
- The InDI formulation is straightforward to implement and train, and produces high-quality results
Related work
- Image restoration is the process of generating a high-quality image from a degraded low-quality measurement.
- Supervised image restoration can be done using Denoising Diffusion Probabilistic Models or Score-based models.
- An alternative formulation is to use a conditional denoising diffusion model to generate samples from the posterior distribution.
InDI: our proposed formulation
- Definition of continuous forward degradation process
- Starts from clean sharp image at time t = 0 and degrades to blurry/noisy observation at time t = 1
- Index t referred to as time-step
- Recovery method starts with input degraded image (time t = 1)
- Proposition 4.1 provides cornerstone of approach
- Posterior mean at time s < t can be deduced from estimate at time t
- Scheme to move from t to s = t − δ
- Process starts from x_1 = y; step δ < 1 controls “speed” of reverse process
- Requires p_{x_t}(x_t) > 0 for iteration procedure to be well defined
- Toy example given
- Train family of regressors F_θ(·; t) to reconstruct x from x_t at given t
- Iterative scheme converges to one of possible modes
- Add small amount of noise to low-quality input to guarantee regularity requirements
- Algorithm 1 given
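The scheme in this section can be sketched in a few lines. The sketch assumes the forward process is the linear blend x_t = (1 − t) x + t y and that one reverse step is x_{t−δ} = (δ/t) F_θ(x_t, t) + (1 − δ/t) x_t. Neither formula is spelled out in this summary, so treat both as assumptions about the paper's formulation. The names `degrade`, `training_example`, `indi_restore`, and `oracle` are hypothetical, and plain Python lists stand in for image tensors.

```python
import random

def degrade(x, y, t):
    """Forward process (assumed): x_t = (1 - t) * x + t * y, elementwise."""
    return [(1.0 - t) * xi + t * yi for xi, yi in zip(x, y)]

def training_example(x, y, rng):
    """Build one regression sample: the network sees (x_t, t) and targets x."""
    t = rng.random()  # assumption: t ~ Uniform[0, 1); other p(t) are possible
    return degrade(x, y, t), t, x

def indi_restore(y, restorer, num_steps):
    """Reverse process: start at x_1 = y, step to t = 0 with delta = 1 / num_steps."""
    delta = 1.0 / num_steps
    x_t = list(y)
    for k in range(num_steps, 0, -1):
        t = k * delta
        x0_hat = restorer(x_t, t)  # current estimate of the clean image
        w = delta / t              # fraction of the way to move toward it
        x_t = [w * xh + (1.0 - w) * xi for xh, xi in zip(x0_hat, x_t)]
    return x_t

# Toy check with a hypothetical oracle that always returns the true clean signal.
clean = [0.2, 0.8, 0.5]
degraded = [0.5, 0.5, 0.5]

def oracle(x_t, t):
    return clean

restored = indi_restore(degraded, oracle, num_steps=10)
```

With `num_steps=1` the loop reduces to a single call `restorer(y, 1.0)`, which is plain one-shot regression. Larger values take smaller steps, each moving only a fraction δ/t of the way toward the current estimate.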
Experiments
- Framework is trained and evaluated on four image restoration tasks
- Formulation is generative-based and can be used for image generation
- Quality of proposed method is evaluated using distortion and perceptual metrics
- Results show direct impact of number of steps on distortion-perception tradeoff
- Model architecture is U-Net-like and trained on image crops using ADAM optimizer
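As a reference point for the distortion metrics mentioned above, PSNR can be computed directly from the mean squared error. This is a minimal sketch that assumes pixel values in [0, 1] stored as flat Python lists; the function name `psnr` and the `max_value` parameter are illustrative choices, not the paper's evaluation code.

```python
import math

def psnr(reference, restored, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two equally sized pixel lists."""
    mse = sum((r - s) ** 2 for r, s in zip(reference, restored)) / len(reference)
    if mse == 0.0:
        return math.inf  # identical images: no distortion at all
    return 10.0 * math.log10(max_value ** 2 / mse)

# A uniform error of 0.1 on a [0, 1] scale gives an MSE of about 0.01.
example_psnr = psnr([0.0, 0.0, 0.0], [0.1, 0.1, 0.1])
```

Higher PSNR means lower pixel-wise distortion. It does not measure perceptual realism, which is why the evaluation also reports perceptual metrics.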
Motion deblurring
- Motion deblurring is a difficult task because there is no known degradation model.
- The best current solution is to train regression models on paired data (sharp and blurry frames).
- A dataset of 3214 pairs of clean and blurry 1280 × 720 images is used for training.
- Results show that the iterative image restoration produces images with more details than regression based solutions.
Single-image super-resolution
- Evaluated iterative restoration methodology on single-image 4× super-resolution
- Compared to other state-of-the-art models
- Proposed framework leads to upscaled images with more defined structure
- Adversarial formulation produces slightly better fine-grained details
- Adding a small amount of noise to the input image leads to better results
Defocus deblurring
- Defocus deblurring is the task of reducing blur caused by limited depth-of-field or misfocus.
- A dataset of 1000 pairs of sharp and blurry images was used to train a model.
- Increasing the number of inference steps improves the quality of the result.
Compression artifact removal
- JPEG compression introduces blocking artifacts and lack of high-frequency details
- Proposed method evaluated on task of removing strong JPEG compression artifacts
- Training data generated from 1000 DIV2K high-quality images
- Model evaluated on DIV2K validation set
- The more inference steps used, the more detail in the restored images
Discussion
A generative framework
- A natural question is whether the proposed approach is generative in the spirit of diffusion formulations.
- A restoration model was trained to start from pure Gaussian noise and Figure 8 shows some generated samples.
- The generated samples have an FID of 9.19, which is not state-of-the-art.
- InDI and diffusion models have different motivations/formulations and inference strategies.
Comparison of inference algorithms
Impact of distribution p(t)
- Training with a bias towards t = 1 yields best results
- Best distribution depends on model capacity and restoration task
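One simple way to bias training time-steps towards t = 1 is to warp a uniform draw with a power. This is only an illustrative sketch: the summary does not state which family of distributions p(t) the paper uses, and `sample_time_step` and its `bias` parameter are hypothetical.

```python
import random

def sample_time_step(rng, bias=2.0):
    """Draw t in (0, 1]; bias > 1 skews mass towards t = 1, bias = 1 is uniform."""
    u = 1.0 - rng.random()  # uniform on (0, 1], so t is never exactly 0
    return u ** (1.0 / bias)

rng = random.Random(0)
biased_mean = sum(sample_time_step(rng, bias=2.0) for _ in range(10_000)) / 10_000
uniform_mean = sum(sample_time_step(rng, bias=1.0) for _ in range(10_000)) / 10_000
```

With `bias=2.0` the density of t is 2t, so the average sampled time-step sits near 2/3 rather than 1/2. The regressor then sees heavily degraded inputs more often during training.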
Impact of adding noise on inverting deterministic degradations
- JPEG compression is a non-linear and deterministic degradation.
- Adding a small amount of noise can improve the results.
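The noise perturbation described above can be sketched as follows. The sketch assumes additive Gaussian noise of standard deviation `eps` applied to the degraded input before restoration. The function name `perturb_input` and the value of `eps` are hypothetical, and the summary does not give the noise level the paper uses.

```python
import random

def perturb_input(y, eps, rng):
    """Return y + eps * n with n ~ N(0, 1) drawn independently per element."""
    return [yi + eps * rng.gauss(0.0, 1.0) for yi in y]

rng = random.Random(0)
jpeg_decoded = [0.25, 0.50, 0.75]  # stand-in for a deterministically degraded image
noisy_input = perturb_input(jpeg_decoded, eps=0.01, rng=rng)
```

Because a deterministic degradation maps each clean image to exactly one output, the noise spreads the inputs over a neighbourhood. This is one way to meet the regularity requirement noted in the formulation section.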
Comparison to a conditional denoising diffusion model
- InDI is compared to a vanilla conditional DDPM
- A vanilla conditional DDPM is trained using noise level as an additional input
- The model architecture is the same as InDI but the auxiliary noise image is concatenated with the low-quality input
- InDI produces comparable results with fewer steps than the vanilla DDPM
Conclusions and limitations
- Novel formulation of image restoration circumvents regression-to-mean problem
- Allows for restored images with superior realism and perceptual quality
- Restoration task broken into many small, easier problems
- Supervised formulation requires paired training data
- Performance for out-of-distribution samples not guaranteed
- Accumulation of errors can degrade performance
- Future work to better characterize limiting points and develop robust formulations
- Iterative inference algorithm produces high-quality restorations
- Results on 4× SR DIV2K validation dataset