Abstract

  • NeRF is a method for synthesizing novel views from a dense set of images
  • NeRF is limited by the need for numerous calibrated views and its accuracy decreases in a few-shot setting
  • Self-NeRF is proposed to address this challenge by iteratively refining the radiance fields from only a few input views
  • Uncertainty-aware NeRF is constructed with specialized embeddings and cone entropy regularization to leverage the pseudo-views
  • Self-NeRF is robust to input with uncertainty and outperforms existing methods when trained on limited data

Paper Content

Introduction

  • Synthesizing novel camera views is an important task in computer vision.
  • Classic techniques have addressed this problem using structure-from-motion or light fields.
  • Neural Radiance Fields (NeRF) have gained popularity due to impressive results in photo-realistic rendering.
  • When the known views are limited, NeRF can collapse to trivial solutions.
  • Previous works attempt to incorporate additional priors to make the problem tractable.
  • Self-NeRF is proposed to solve the few-shot novel view synthesis task without additional priors.
  • Self-NeRF leverages an uncertainty-aware NeRF, specialized embeddings and a cone entropy regularization.
  • Self-NeRF shows state-of-the-art performance on few-shot novel view synthesis.

Novel view synthesis

  • Earlier works used view interpolation and light fields to reconstruct novel views
  • Some works utilized proxy geometry and explicit representations such as layered representations, voxel, mesh and point cloud
  • Learning-based methods have attracted growing attention
  • Volumetric representations used to address photorealistic view synthesis
  • Improvements in rendering speed, artistic effects and generalization ability of NeRF

Few-shot view synthesis

  • NeRF requires many calibrated images
  • Some works attempt to reduce data-hungriness by using depth priors
  • Other methods use multi-view stereo methods to produce a multi-view feature volume
  • Recently, some studies have used warped views with few-shot images to improve neural radiance fields
  • Some methods are prior-free and minimize ray entropy among seen and unseen poses

Method

Preliminaries

  • Neural radiance fields represent a 3D scene as a continuous implicit function
  • NeRF uses a multi-layer perceptron to predict volume density and color
  • Mip-NeRF casts a cone that passes through the pixel’s center to reduce aliasing artifacts
  • Mip-NeRF fails to generalize well to novel views at test time
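
For readers who want the mechanics behind the first two points, the standard NeRF volume-rendering quadrature can be sketched as below. This is the generic formulation (per-interval opacity, accumulated transmittance, weighted sum of colors), not Self-NeRF-specific code; the function name `render_ray` and its argument layout are assumptions made for the example, and the cone-based sampling of mip-NeRF is not modeled.

```python
import numpy as np

def render_ray(sigmas, colors, t_vals):
    """Composite per-sample densities and colors along a single ray.

    sigmas: (N,) volume densities predicted by the MLP.
    colors: (N, 3) RGB values predicted by the MLP.
    t_vals: (N + 1,) distances of the interval edges along the ray.
    Returns the rendered RGB, the expected depth, and the per-sample weights.
    """
    sigmas = np.asarray(sigmas, dtype=np.float64)
    colors = np.asarray(colors, dtype=np.float64)
    t_vals = np.asarray(t_vals, dtype=np.float64)
    deltas = np.diff(t_vals)                      # interval lengths
    alphas = 1.0 - np.exp(-sigmas * deltas)       # opacity of each interval
    # Transmittance: fraction of light that reaches interval i unoccluded.
    trans = np.concatenate([[1.0], np.cumprod(1.0 - alphas)[:-1]])
    weights = trans * alphas
    rgb = (weights[:, None] * colors).sum(axis=0)
    t_mid = 0.5 * (t_vals[:-1] + t_vals[1:])
    depth = (weights * t_mid).sum()
    return rgb, depth, weights
```

The returned `weights` are the quantities that an entropy-style regularizer would act on, which is why the sketch exposes them alongside color and depth.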

Pseudo-views in Self-NeRF

  • Gather pseudo-views synthesized for unseen views by f_θ^(i−1), the model from the previous iteration
  • Generate warped pseudo-views through forward warping
  • Pixels in warped pseudo-views are reprojected from seen views
  • Adapt mip-NeRF to be tolerant of uncertainty
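
The reprojection step behind forward warping can be sketched as follows. This is a minimal illustration under a pinhole camera model, not the paper's implementation: the function name `forward_warp_coords`, the shared intrinsics `K`, and the pose convention (`R`, `t` map source-camera coordinates to target-camera coordinates) are all assumptions made for the example. It only computes where each seen-view pixel lands in the unseen view; splatting colors and resolving occlusions are left out.

```python
import numpy as np

def forward_warp_coords(depth, K, R, t):
    """Reproject every pixel of a source (seen) view into a target (unseen) view.

    depth: (H, W) depth of the source view.
    K:     (3, 3) pinhole intrinsics, assumed shared by both views.
    R, t:  (3, 3) rotation and (3,) translation taking source-camera
           coordinates to target-camera coordinates (assumed convention).
    Returns (H, W, 2) target pixel coordinates and (H, W) target depths.
    """
    H, W = depth.shape
    u, v = np.meshgrid(np.arange(W), np.arange(H))  # pixel grid, each (H, W)
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).astype(np.float64)
    # Back-project pixels to 3D points in the source camera frame.
    pts_src = (pix @ np.linalg.inv(K).T) * depth[..., None]
    # Move the points into the target camera frame and project them.
    pts_tgt = pts_src @ R.T + t
    proj = pts_tgt @ K.T
    z = proj[..., 2]
    uv = proj[..., :2] / z[..., None]
    return uv, z
```

As a sanity check on the geometry: with a focal length of 100 pixels, a constant depth of 2 and a sideways translation of 0.1, every pixel moves 5 pixels horizontally, and with an identity pose every pixel maps to itself.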

Uncertainty-aware model in Self-NeRF

  • Model adds two specialized embeddings and a branch to emit a field of uncertainty
  • Warping embeddings distinguish warped pseudo-views from predicted pseudo-views
  • Uncertain embeddings model per-image uncertain colors
  • Model relaxes strict consistency assumption and attenuates negative impact of uncertainty

Inference and optimization

  • Model the predicted color as a normal distribution
  • Calculate color with Equations 2 and 4 of the paper
  • Render depth with Equation 6 of the paper
  • Calculate the RGB loss as a negative log-likelihood
  • Calculate the pseudo loss on the pseudo-views
  • Regularize cone tracing with cone entropy loss
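
The two loss ingredients named above can be illustrated with textbook forms. The paper's own equations are not reproduced in this summary, so the sketch below is an assumption-laden stand-in: `gaussian_nll` is the usual negative log-likelihood of a normal distribution, and `weight_entropy` is a Shannon entropy over normalized rendering weights, in the spirit of ray-entropy regularization. Both function names, the `eps` stabilizers, and the normalization choice are hypothetical.

```python
import numpy as np

def gaussian_nll(target, mean, var, eps=1e-6):
    """Mean negative log-likelihood of `target` under N(mean, var)."""
    target = np.asarray(target, dtype=np.float64)
    mean = np.asarray(mean, dtype=np.float64)
    var = np.maximum(np.asarray(var, dtype=np.float64), eps)  # keep variance positive
    nll = 0.5 * np.log(2.0 * np.pi * var) + 0.5 * (target - mean) ** 2 / var
    return float(np.mean(nll))

def weight_entropy(weights, eps=1e-10):
    """Shannon entropy of rendering weights normalized to a distribution."""
    w = np.asarray(weights, dtype=np.float64)
    p = w / (w.sum() + eps)
    return float(-(p * np.log(p + eps)).sum())
```

With a predicted variance, pixels the model is unsure about contribute a smaller squared-error term, which is one way an uncertain pseudo-view can be down-weighted. A low entropy means the weights concentrate on a few samples, which corresponds to a sharper surface along the ray or cone.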

Convergence analysis of Self-NeRF

  • Lee et al. analyzed the self-learning technique and showed that it is equivalent to entropy regularization
  • Unlabeled data can improve generalization performance even if the pseudo labels are not precise
  • Self-NeRF has an upper bound, which can be used to determine when it has converged

Experiments

Experimental settings

  • Compared to NeRF, DietNeRF and InfoNeRF
  • Evaluated on NeRF synthetic dataset and LLFF dataset
  • Metrics used: PSNR, SSIM, LPIPS
  • Implemented with PyTorch and the Adam optimizer
  • Self-NeRF outperforms other methods in terms of all evaluation metrics
  • Self-NeRF produces more realistic renderings with fewer artifacts
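
Of the three metrics listed, PSNR is simple enough to state in a few lines. The sketch below uses the standard definition, 10·log10(MAX² / MSE); the function name `psnr` and the default `max_val=1.0` (images scaled to [0, 1]) are assumptions for the example. SSIM and LPIPS are more involved and are usually taken from existing libraries.

```python
import numpy as np

def psnr(pred, target, max_val=1.0):
    """Peak signal-to-noise ratio in dB between two images of equal shape."""
    pred = np.asarray(pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    mse = np.mean((pred - target) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(max_val ** 2 / mse))
```

Higher PSNR and SSIM are better, while lower LPIPS is better, which matters when reading a results table that mixes the three.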

Ablation study

  • Performed an ablation study on two scenes from NeRF synthetic dataset
  • Studied effectiveness of uncertainty-aware NeRF by replacing it with mip-NeRF and NeRF-W
  • Quantitative and qualitative results given in Table 2 and Figure 7

Analysis

  • Self-NeRF's performance improves gradually as the number of training views increases.
  • The performance advantage of Self-NeRF saturates at 16 training images.

Conclusion

  • Propose Self-NeRF to synthesize novel views given few-shot images
  • Iteratively generate pseudo-views and train model with seen views and pseudo-views
  • Two categories of pseudo-views: predicted and warped
  • Pseudo-views have a stabilizing effect and alleviate color shifts
  • Uncertainty-aware NeRF with specialized embeddings
  • Cone entropy regularization to reconstruct fine details
  • Competitive with state-of-the-art models
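
The iterative procedure summarized above can be sketched as a generic self-training loop. Everything here is a hypothetical scaffold rather than the authors' code: `train_fn` and `render_fn` are placeholder callables supplied by the caller, the first round is assumed to use seen views only, `num_iters` is an arbitrary default, and warped pseudo-views are omitted for brevity.

```python
def self_train(seen_views, unseen_poses, train_fn, render_fn, num_iters=3):
    """Alternate between rendering pseudo-views and retraining on them.

    seen_views:   list of (pose, image) pairs with ground-truth images.
    unseen_poses: poses for which no image is available.
    train_fn:     hypothetical callable (seen, pseudo) -> model.
    render_fn:    hypothetical callable (model, pose) -> image.
    Returns the final model and the list of models from every round.
    """
    # Round 0 (assumption): fit on the few seen views alone.
    model = train_fn(list(seen_views), [])
    history = [model]
    for _ in range(num_iters):
        # The previous model predicts images for the unseen poses.
        pseudo = [(pose, render_fn(model, pose)) for pose in unseen_poses]
        # Retrain on seen views together with the new pseudo-views.
        model = train_fn(list(seen_views), pseudo)
        history.append(model)
    return model, history
```

Keeping the models from every round makes it possible to monitor a validation metric across iterations and stop once it no longer improves.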