Link to paper

The full paper is available here.

You can also find the paper on PapersWithCode here.

Abstract

  • Diffusion models are used for zero-shot image restoration
  • Diffusion models are pre-trained and do not require finetuning
  • Current methods only discuss how to deal with fixed-size images
  • This paper focuses on how to use diffusion-based zero-shot IR methods to deal with any size
  • Mask-Shift Restoration and Hierarchical Restoration are proposed to address local incoherence and out-of-domain issues
  • Code is available on GitHub

Paper Content

Introduction

  • Recent progress in diffusion models has improved Image Restoration tasks
  • Diffusion-based IR methods can be divided into supervised and zero-shot
  • Zero-shot methods only need pre-trained off-the-shelf diffusion model
  • Difficulties in applying zero-shot IR methods to arbitrary output size
  • Proposed Mask-Shift Restoration (MSR) to solve boundary artifacts
  • Proposed Hierarchical Restoration (HiR) to address lack of global semantics
  • MSR and HiR are parameter-free and training-free

Preliminaries

Diffusion models

  • Diffusion models have diverse interpretations.
  • Diffusion models define a forward process (adding random noise to data) and a reverse process (constructing desired data samples from the noise).
  • The reverse process estimates the clean image from the noisy image.
  • A denoiser is used to generate the previous state from the noisy image.

Denoising diffusion null-space model (ddnm)

  • Pre-trained diffusion models can be used to solve linear inverse problems without extra training or optimization
  • DDNM starts with noise-free linear image inverse problems
  • Image restoration aims to yield a result that satisfies two constraints
  • General solution to the problem satisfies the Consistency constraint
  • DDNM uses the general solution to find a proper null-space variable to meet the Realness constraint
  • Algorithm 1 shows the whole process of DDNM

Method

  • Diffusion model and DDNM introduced
  • Limitation of image processing size lies in denoiser
  • Pre-trained denoisers used for unlimited-size image restoration
  • Two methods proposed to achieve this goal

Process as a whole image

  • Typical diffusion models use U-Net structures as the denoiser backbone.
  • U-Net is a convolutional network and supports scalable input size.
  • Diffusion models trained on fixed image size may face Out-Of-Domain (OOD) problem when applied to other image sizes.
  • One way to solve the OOD issue is to train the denoiser with a random cropped dataset.

Process as patches

  • Directly changing the model processing size may yield bad results when facing OOD problems
  • Limitations on image size, e.g. divisible by 32
  • Large sizes, e.g. 1024x1024, may cause unaffordable memory consumption
  • Diffusion models with fixed processing sizes cannot solve arbitrary image sizes

Mask-shift restoration

  • Inpainting is a typical image restoration task
  • Zero-shot methods like DDNM and RePaint show good performance in solving inpainting
  • Overlapped regions can be used as an extra constraint when solving patches
  • This constraint can be integrated into existing zero-shot methods
  • Example of 4xSR task given with an input image of size 64x96 and aim to get an SR result of size 256x384

Hierarchical restoration

  • MSR has a small receptive field when dealing with large images.
  • This can lead to poor semantic information recovery.
  • HiR is proposed to extend the receptive field for better semantic restoration.
  • HiR consists of two phases: semantic restoration and texture restoration.
  • HiR is not limited to inpainting tasks, but is also useful for large-scale SR and colorization.

Flexible pipeline for applications

  • MSR and HiR are patch connection and quality improvement technologies.
  • MSR and HiR use prior knowledge to reduce the solution space.
  • MSR and HiR can be implemented using Range-Null space Decomposition.
  • MSR and HiR can be used for other zero-shot IR methods.

Experiment

  • Experiments use denoiser pre-trained on ImageNet 256x256
  • Classifier guidance and time-travel sampling used to improve generative quality
  • Desired result size divided into patches of 256x256 with 128 pixel overlaps
  • First patch solved using original DDNM, following patches solved using MSR based on DDNM
  • Results on 4x SR with T = 100, time-travel length l = 10, repeat times r = 3
  • Range-Null space Decomposition (RND) is a concept in linear algebra
  • RND defines the upper limit of recoverable information
  • GAN Prior is used to learn the Null-space
  • Diffusion sampling is used to learn the Null-space
  • Diffusion-based Zero-Shot Image Restoration Methods are divided into RND-based and optimization-based

Limitations & discussions

  • Zero-shot IR methods using diffusion models open up a promising new direction for IR problems
  • Method proposed in paper enables unlimited image size
  • Limitations include more calculation and time consumption than supervised methods, ceiling of performance depends on pre-trained diffusion models, and degradation operator is needed
  • MSR can be seen as a general image connection method
  • Experiments on 4x SR and noisy 4x SR of different sizes
  • Compared with BSRGAN, method performs better in realness and consistency
  • HiR used for colorization