Link to paper
The full paper is available here.
You can also find the paper on PapersWithCode here.
Abstract
- Diffusion models are used for zero-shot image restoration
- Diffusion models are pre-trained and do not require finetuning
- Current methods only discuss how to deal with fixed-size images
- This paper focuses on how to use diffusion-based zero-shot IR methods to handle images of any size
- Mask-Shift Restoration and Hierarchical Restoration are proposed to address local incoherence and out-of-domain issues
- Code is available on GitHub
Paper Content
Introduction
- Recent progress in diffusion models has improved Image Restoration tasks
- Diffusion-based IR methods can be divided into supervised and zero-shot
- Zero-shot methods only need a pre-trained, off-the-shelf diffusion model
- Applying zero-shot IR methods to arbitrary output sizes is difficult
- Proposed Mask-Shift Restoration (MSR) to solve boundary artifacts
- Proposed Hierarchical Restoration (HiR) to address lack of global semantics
- MSR and HiR are parameter-free and training-free
Preliminaries
Diffusion models
- Diffusion models have diverse interpretations.
- Diffusion models define a forward process (adding random noise to data) and a reverse process (constructing desired data samples from the noise).
- The reverse process estimates the clean image from the noisy image.
- A denoiser is used to generate the previous state from the noisy image.
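The forward process and the clean-image estimate described above can be illustrated with a minimal NumPy sketch. It assumes the common DDPM parameterization with a linear beta schedule (`T=1000`, `beta_start`, `beta_end` are illustrative values, not taken from the paper), and it uses the true noise as a stand-in for a denoiser's prediction, so it only demonstrates the algebra.

```python
import numpy as np

def make_alpha_bar(T=1000, beta_start=1e-4, beta_end=2e-2):
    """Cumulative product of (1 - beta_t) for a linear beta schedule (assumed)."""
    betas = np.linspace(beta_start, beta_end, T)
    return np.cumprod(1.0 - betas)

def forward_noise(x0, t, alpha_bar, rng):
    """Forward process: sample x_t from q(x_t | x_0) by adding Gaussian noise."""
    eps = rng.standard_normal(x0.shape)
    xt = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps
    return xt, eps

def estimate_x0(xt, eps_pred, t, alpha_bar):
    """Reverse direction: estimate the clean image from x_t and a noise prediction."""
    return (xt - np.sqrt(1.0 - alpha_bar[t]) * eps_pred) / np.sqrt(alpha_bar[t])

rng = np.random.default_rng(0)
x0 = rng.uniform(-1.0, 1.0, size=(8, 8))   # toy "image"
alpha_bar = make_alpha_bar()
xt, eps = forward_noise(x0, 500, alpha_bar, rng)

# With a perfect noise prediction (here: the true noise), x0 is recovered exactly.
x0_hat = estimate_x0(xt, eps, 500, alpha_bar)
```

In a real sampler, `eps_pred` would come from the pre-trained denoiser rather than from the known noise.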
Denoising diffusion null-space model (DDNM)
- Pre-trained diffusion models can be used to solve linear inverse problems without extra training or optimization
- DDNM starts with noise-free linear image inverse problems
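- Image restoration aims to yield a result x̂ that satisfies two constraints: Consistency (A x̂ = y, the result reproduces the observation y under the degradation operator A) and Realness (x̂ follows the distribution of real images)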
- General solution to the problem satisfies the Consistency constraint
- DDNM uses the general solution to find a proper null-space variable to meet the Realness constraint
- Algorithm 1 shows the whole process of DDNM
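The range-null space step behind the Consistency constraint can be sketched as follows. This is a minimal illustration, not the paper's Algorithm 1: it assumes the degradation `A` is 4x average pooling, whose pseudo-inverse is pixel replication, and it uses a random array `x_bar` as a stand-in for the denoiser's clean-image estimate. The function names are hypothetical.

```python
import numpy as np

def A(x, s=4):
    """Assumed degradation: s-times average-pooling downsampler."""
    h, w = x.shape
    return x.reshape(h // s, s, w // s, s).mean(axis=(1, 3))

def A_pinv(y, s=4):
    """Pseudo-inverse of average pooling: replicate each pixel s x s times."""
    return np.repeat(np.repeat(y, s, axis=0), s, axis=1)

def rnd_project(x_bar, y, s=4):
    """x_hat = A^+ y + (I - A^+ A) x_bar.

    The range-space part comes from y; the null-space part comes from x_bar.
    """
    return A_pinv(y, s) + x_bar - A_pinv(A(x_bar, s), s)

rng = np.random.default_rng(0)
x_true = rng.uniform(size=(32, 32))
y = A(x_true)                       # noise-free low-resolution observation
x_bar = rng.uniform(size=(32, 32))  # stand-in for the denoiser's estimate
x_hat = rnd_project(x_bar, y)
```

Whatever `x_bar` is, `A(x_hat)` equals `y`, so Consistency holds by construction; Realness depends entirely on how good the null-space content from the denoiser is.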
Method
- Diffusion model and DDNM introduced
- The limit on processing size comes from the denoiser
- The goal is to use pre-trained denoisers for unlimited-size image restoration
- Two methods proposed to achieve this goal
Process as a whole image
- Typical diffusion models use U-Net structures as the denoiser backbone.
- U-Net is a convolutional network and supports scalable input size.
- Diffusion models trained on fixed image size may face Out-Of-Domain (OOD) problem when applied to other image sizes.
- One way to solve the OOD issue is to train the denoiser on a randomly cropped dataset.
Process as patches
- Directly changing the model processing size may yield bad results when facing OOD problems
- The network also constrains the image size, e.g. height and width must be divisible by 32
- Large sizes, e.g. 1024x1024, may cause unaffordable memory consumption
- Diffusion models with fixed processing sizes cannot handle arbitrary image sizes
Mask-shift restoration
- Inpainting is a typical image restoration task
- Zero-shot methods like DDNM and RePaint show good performance in solving inpainting
- Overlapped regions can be used as an extra constraint when solving patches
- This constraint can be integrated into existing zero-shot methods
- A 4x SR example is given: the input image has size 64x96 and the target SR result has size 256x384
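The overlap constraint can be sketched for the 256x384 example above. This is a simplified illustration, assuming 256x256 patches shifted by 128 pixels and showing only the inpainting-style part of the constraint (keep the already-restored overlap, let the model fill the rest). The SR consistency term is omitted, random arrays stand in for restored patches and denoiser estimates, and the helper names are hypothetical.

```python
import numpy as np

PATCH = 256   # processing size of the pre-trained denoiser
SHIFT = 128   # horizontal shift between neighbouring patches (assumed)

def overlap_mask(done, top, left, patch=PATCH):
    """Boolean mask of pixels in this patch that earlier patches already restored."""
    return done[top:top + patch, left:left + patch].copy()

def apply_overlap_constraint(x_bar, known, mask):
    """Keep known pixels in the overlap; take the model's estimate elsewhere."""
    return np.where(mask, known, x_bar)

H, W = 256, 384
canvas = np.zeros((H, W))
done = np.zeros((H, W), dtype=bool)
rng = np.random.default_rng(0)

# First patch: stand-in for a result restored by the unmodified method.
first = rng.uniform(size=(PATCH, PATCH))
canvas[0:PATCH, 0:PATCH] = first
done[0:PATCH, 0:PATCH] = True

# Second patch: shifted right by SHIFT, so its left half overlaps the first patch.
top, left = 0, SHIFT
mask = overlap_mask(done, top, left)
known = canvas[top:top + PATCH, left:left + PATCH]
x_bar = rng.uniform(size=(PATCH, PATCH))   # stand-in for the denoiser's estimate
x_hat = apply_overlap_constraint(x_bar, known, mask)

canvas[top:top + PATCH, left:left + PATCH] = x_hat
done[top:top + PATCH, left:left + PATCH] = True
```

In an actual zero-shot sampler this replacement would be applied at every reverse step, so the newly generated region stays coherent with the fixed overlap.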
Hierarchical restoration
- MSR has a small receptive field when dealing with large images.
- This can lead to poor semantic information recovery.
- HiR is proposed to extend the receptive field for better semantic restoration.
- HiR consists of two phases: semantic restoration and texture restoration.
- HiR is not limited to inpainting tasks, but is also useful for large-scale SR and colorization.
Flexible pipeline for applications
- MSR and HiR are techniques for connecting patches and improving quality.
- MSR and HiR use prior knowledge to reduce the solution space.
- MSR and HiR can be implemented using Range-Null space Decomposition.
- MSR and HiR can be used for other zero-shot IR methods.
Experiment
- Experiments use denoiser pre-trained on ImageNet 256x256
- Classifier guidance and time-travel sampling used to improve generative quality
- The desired result size is divided into patches of 256x256 with 128-pixel overlaps
- First patch solved using original DDNM, following patches solved using MSR based on DDNM
- Results on 4x SR with T = 100, time-travel length l = 10, repeat times r = 3
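The patch layout described above (256x256 patches with 128-pixel overlaps) can be computed with a short helper. This is a sketch under assumptions: the function name is hypothetical, the output is assumed to be at least one patch wide and tall, and clamping the last patch to the image border when the size is not a multiple of the stride is one possible choice, not necessarily the paper's.

```python
def patch_starts(length, patch=256, overlap=128):
    """Start offsets along one axis for overlapping patches.

    Assumes length >= patch. The last patch is clamped to the border
    (an assumption) so that the whole axis is covered.
    """
    if length <= patch:
        return [0]
    stride = patch - overlap
    starts = list(range(0, length - patch + 1, stride))
    if starts[-1] != length - patch:
        starts.append(length - patch)
    return starts

def patch_grid(height, width, patch=256, overlap=128):
    """Top-left corners of all patches, in row-major solving order."""
    return [(top, left)
            for top in patch_starts(height, patch, overlap)
            for left in patch_starts(width, patch, overlap)]

# The 256x384 result from the 4x SR example needs two patches side by side.
grid_small = patch_grid(256, 384)
# A 1024x1024 result needs a 7x7 grid of overlapping patches.
grid_large = patch_grid(1024, 1024)
```

The first entry of the grid would be solved with the original method and every later entry with the overlap constraint from its already-solved neighbours.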
Related work
- Range-Null space Decomposition (RND) is a concept in linear algebra
- RND defines the upper limit of recoverable information
- GAN priors have been used to learn the null-space content
- Diffusion sampling has also been used to learn the null-space content
- Diffusion-based Zero-Shot Image Restoration Methods are divided into RND-based and optimization-based
Limitations & discussions
- Zero-shot IR methods using diffusion models open up a promising new direction for IR problems
- The method proposed in the paper enables unlimited image size
- Limitations: more computation and time than supervised methods; a performance ceiling set by the pre-trained diffusion model; and the degradation operator must be known
- MSR can be seen as a general image connection method
- Experiments on 4x SR and noisy 4x SR of different sizes
- Compared with BSRGAN, the method performs better in realness and consistency
- HiR used for colorization