Link to paper
The full paper is available here.
You can also find the paper on PapersWithCode here.
Abstract
- NeRF is a method for synthesizing novel views from a dense set of images
- NeRF is limited by the need for numerous calibrated views and its accuracy decreases in a few-shot setting
- Self-NeRF is proposed to address this challenge by iteratively refining the radiance fields with few input views
- Uncertainty-aware NeRF is constructed with specialized embeddings and cone entropy regularization to leverage the pseudo-views
- Self-NeRF is robust to input with uncertainty and outperforms existing methods when trained on limited data
Paper Content
Introduction
- Synthesizing novel camera views is an important task in computer vision.
- Classic techniques have addressed this problem using structure-from-motion or light fields.
- Neural Radiance Fields (NeRF) have gained popularity due to impressive results in photo-realistic rendering.
- When the known views are limited, NeRF can collapse to trivial solutions.
- Previous works attempt to incorporate additional priors to make the problem tractable.
- Self-NeRF is proposed to solve the few-shot novel view synthesis task without additional priors.
- Self-NeRF leverages an uncertainty-aware NeRF, specialized embeddings and a cone entropy regularization.
- Self-NeRF shows state-of-the-art performance on few-shot novel view synthesis.
Related work
Novel view synthesis
- Earlier works used view interpolation and light fields to reconstruct novel views
- Some works utilized proxy geometry and explicit representations such as layered representations, voxel, mesh and point cloud
- Growing attention on learning-based methods
- Volumetric representations used to address photorealistic view synthesis
- Improvements in rendering speed, artistic effects and generalization ability of NeRF
Few-shot view synthesis
- NeRF requires many calibrated images
- Some works attempt to reduce data-hungriness by using depth priors
- Other methods use multi-view stereo methods to produce a multi-view feature volume
- Recently, some studies have used warped views with few-shot images to improve neural radiance fields
- Some methods are prior-free and minimize ray entropy among seen and unseen poses
Method
Preliminaries
- Neural radiance fields represent a 3D scene as a continuous implicit function
- NeRF uses a multi-layer perceptron (MLP) to predict volume density and color
- Mip-NeRF casts a cone that passes through the pixel’s center to reduce aliasing artifacts
- Mip-NeRF fails to generalize well to novel views at test time
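As a rough illustration of how density and color predictions become a pixel, the sketch below composites samples along one ray using the standard NeRF quadrature rule. It is a simplified point-sample version (mip-NeRF integrates over conical frustums instead), and the function name `composite_ray` and its arguments are assumptions made for this example, not names from the paper.

```python
import math

def composite_ray(sigmas, deltas, colors):
    """Composite samples along one ray (standard NeRF quadrature).

    sigmas: non-negative volume densities at each sample
    deltas: distances between adjacent samples
    colors: per-sample RGB triples
    Returns (rgb, weights), where weights[i] is the contribution of sample i.
    """
    transmittance = 1.0  # fraction of light that reaches the current sample
    rgb = [0.0, 0.0, 0.0]
    weights = []
    for sigma, delta, color in zip(sigmas, deltas, colors):
        alpha = 1.0 - math.exp(-sigma * delta)  # opacity of this segment
        weight = transmittance * alpha
        weights.append(weight)
        for k in range(3):
            rgb[k] += weight * color[k]
        transmittance *= 1.0 - alpha  # light remaining after this segment
    return rgb, weights
```

The same weights can be reused to render an expected depth, by summing each weight times the distance of its sample along the ray.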
Pseudo-views in Self-NeRF
- Gather pseudo-views synthesized by $f_\theta^{i-1}$ for unseen views
- Generate warped pseudo-views through forward warping
- Pixels in warped pseudo-views are reprojected from seen views
- Adapt mip-NeRF to be tolerant of uncertainty
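The reprojection step behind forward warping can be sketched for a single pixel as follows. This is a minimal example under stated assumptions: a pinhole camera with intrinsics shared by both views, z-depth along the optical axis, and a known rigid transform from the source camera frame to the target camera frame. The function `warp_pixel` and its parameters are hypothetical and are not taken from the paper.

```python
def warp_pixel(u, v, depth, intrinsics, rotation, translation):
    """Reproject pixel (u, v) with known depth from a source view into a target view.

    intrinsics:  (fx, fy, cx, cy) of a pinhole camera shared by both views (assumption)
    rotation:    3x3 nested list, source-camera to target-camera rotation
    translation: length-3 list, source-camera to target-camera translation
    Returns (u', v', depth') in the target view, or None if the point
    falls behind the target camera.
    """
    fx, fy, cx, cy = intrinsics
    # Unproject the pixel to a 3D point in the source camera frame.
    p = [(u - cx) / fx * depth, (v - cy) / fy * depth, depth]
    # Apply the rigid transform into the target camera frame.
    q = [sum(rotation[r][c] * p[c] for c in range(3)) + translation[r]
         for r in range(3)]
    if q[2] <= 0:
        return None
    # Project onto the target image plane.
    return (fx * q[0] / q[2] + cx, fy * q[1] / q[2] + cy, q[2])
```

A full warped pseudo-view would apply this to every pixel of a seen view and resolve collisions and holes in the target image, which this sketch leaves out.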
Uncertainty-aware model in Self-NeRF
- Model adds two specialized embeddings and a branch to emit a field of uncertainty
- Warping embeddings distinguish warped pseudo-views from predicted pseudo-views
- Uncertain embeddings model per-image uncertain colors
- Model relaxes strict consistency assumption and attenuates negative impact of uncertainty
Inference and optimization
- Model the predicted color with a normal distribution
- Calculate color with equations 2 and 4
- Render depth with equation 6
- Calculate RGB loss with negative log-likelihood
- Calculate the pseudo loss on the pseudo-views
- Regularize cone tracing with cone entropy loss
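The two loss ingredients named above can be illustrated with a small sketch. Both functions are assumptions made for illustration: `gaussian_nll` is the negative log-likelihood of an isotropic Gaussian with one variance shared across color channels (constant term dropped), and `ray_entropy` is a Shannon entropy over normalized per-sample weights in the style of ray entropy minimization. The paper's exact equations, including its cone entropy definition, are not reproduced in this summary and may differ.

```python
import math

def gaussian_nll(pred_rgb, target_rgb, variance):
    """Negative log-likelihood of target_rgb under an isotropic Gaussian
    centred on pred_rgb, dropping the additive constant."""
    sq_err = sum((p - t) ** 2 for p, t in zip(pred_rgb, target_rgb))
    return sq_err / (2.0 * variance) + 0.5 * len(pred_rgb) * math.log(variance)

def ray_entropy(weights, eps=1e-10):
    """Shannon entropy of the per-sample weights along one ray (or cone),
    after normalizing them to sum to one. Low entropy means the mass is
    concentrated on a few samples."""
    total = sum(weights) + eps
    probs = [w / total for w in weights]
    return -sum(p * math.log(p + eps) for p in probs)
```

With this form of the likelihood, a pixel whose predicted variance is high contributes a smaller squared-error penalty, which is one way imprecise pseudo-view colors can be down-weighted during training.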
Convergence analysis of Self-NeRF
- Lee et al. analyzed the self-learning technique and showed that it is equivalent to entropy regularization
- Unlabeled data can improve generalization performance even if pseudo labels are not precise
- The performance of Self-NeRF has an upper bound, which can be used to determine when training has converged
Experiments
Experimental settings
- Compared to NeRF, DietNeRF and InfoNeRF
- Evaluated on NeRF synthetic dataset and LLFF dataset
- Metrics used: PSNR, SSIM, LPIPS
- Implemented with PyTorch and the Adam optimizer
- Self-NeRF outperforms other methods in terms of all evaluation metrics
- Self-NeRF produces more realistic renderings with fewer artifacts
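Of the three metrics listed, PSNR is simple enough to show directly. The sketch below uses the standard definition over flattened pixel values in the range [0, `max_value`]; the function name and the choice of plain Python lists are assumptions for this example. SSIM and LPIPS are more involved and are normally taken from an existing library.

```python
import math

def psnr(pred, target, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two equally sized
    sequences of pixel values. Higher is better."""
    mse = sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)
```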
Ablation study
- Performed an ablation study on two scenes from NeRF synthetic dataset
- Studied effectiveness of uncertainty-aware NeRF by replacing it with mip-NeRF and NeRF-W
- Quantitative and qualitative results given in Table 2 and Figure 7
Analysis
- Our method gradually improves performance with increasing number of training views.
- Performance advantage of Self-NeRF reaches saturation point with 16 training images.
Conclusion
- Propose Self-NeRF to synthesize novel views given few-shot images
- Iteratively generate pseudo-views and train model with seen views and pseudo-views
- Two categories of pseudo-views: predicted and warped
- The pseudo-views provide a stabilizing effect and alleviate color shifts
- Uncertainty-aware NeRF with specialized embeddings
- Cone entropy regularization to reconstruct fine details
- Competitiveness compared to state-of-the-art models
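The iterative procedure summarized in this conclusion can be sketched as a short loop. Everything here is a hypothetical skeleton: `train`, `render`, and `warp` are caller-supplied placeholders rather than functions from the paper, and details such as the number of iterations or whether each round restarts training or fine-tunes the previous model are not specified in this summary.

```python
def self_train(seen_views, unseen_poses, train, render, warp, num_iterations=3):
    """Skeleton of iterative self-training with pseudo-views.

    train(seen_views, pseudo_views) -> model      (hypothetical)
    render(model, pose) -> predicted pseudo-view  (hypothetical)
    warp(model, seen_views, pose) -> warped pseudo-view (hypothetical)
    """
    # First round: only the few-shot seen views are available.
    model = train(seen_views, [])
    for _ in range(num_iterations):
        # Predicted pseudo-views come from the previous round's model.
        predicted = [render(model, pose) for pose in unseen_poses]
        # Warped pseudo-views are reprojected from the seen views.
        warped = [warp(model, seen_views, pose) for pose in unseen_poses]
        # Retrain on seen views plus both kinds of pseudo-views.
        model = train(seen_views, predicted + warped)
    return model
```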