Link to paper

The full paper is available here.

You can also find the paper on PapersWithCode here.

Abstract

Proposes a new challenge for synthesizing a novel view in a practical environment with limited input images and significant illumination variations
Suggests ExtremeNeRF to address the problem, which utilizes occlusion-aware multiview albedo consistency, supported by geometric alignment and depth consistency
Extracts intrinsic image components that are illumination-invariant across different views
Provides extensive experimental results for an evaluation of the task using the newly built NeRF Extreme benchmark

Paper Content

Introduction

Neural radiance fields (NeRF) have had significant impacts on 3D scene reconstruction and novel view synthesis
Various aspects of NeRF have been improved, such as generalization ability, representation ability, and practicality
ExtremeNeRF provides reliable novel view synthesis results compared to the state-of-the-art method
NeRF often struggles to render large-size patches due to complexity
ExtremeNeRF enforces consistency among intrinsic components between input and rendered views
A new NeRF Extreme dataset has been proposed, which is the first in-the-wild multi-view dataset with both indoor and outdoor scenes taken under varying illumination
ExtremeNeRF provides plausible few-shot view synthesis and video rendering results with fewer inputs and varying illumination

Few-shot NeRF leverages knowledge priors to improve performance
Depth and geometry priors are used
NeRF-W enables view synthesis of internet photos taken under varying illumination
Tancik et al. presented a way to synthesize a large-scale city scene captured under varying lighting conditions
Existing datasets focus on varying single attributes like viewing direction or illumination
Li et al. disentangled intrinsic components without supervision
Ye et al. suggested a NeRF framework for scene editing

Preliminaries

Neural radiance fields (NeRF) is a volume rendering-based view-synthesis framework
NeRF maps 5D inputs to color and volume density
Vanila-NeRF view synthesis is done by optimizing mean squared error on synthesized color
RegNeRF uses a depth smoothness regularization for few-shot view synthesis
RegNeRF relies on the assumption that input images should share consistent illumination conditions

Methodology

Overview

Objective: Build illumination-robust few-shot view synthesis framework
Challenges: Geometric alignment, occluded regions, global context
Solution: Offline intrinsic decomposition network, pseudoalbedo ground truth

Intrinsic consistency regularization

Two images are randomly selected from a set of inputs and novel views for each iteration
A projective transformation is used to get a pixel correspondence between the two images
Albedo consistency is imposed between inputs and novel views
Occlusion handling is used to minimize projection errors
Depth consistency loss is used to regularize the scene geometry and reduce floating artifacts

Albedo estimation

Intrinsic decomposition requires global context of a scene
NeRF struggles with large-resolution inputs
Intrinsic rendering methods cannot be used due to lack of input data
Proposed two-stage intrinsic decomposition pipeline: FIDNet and PIDNet
FIDNet extracts intrinsic components of input images offline
PIDNet extracts patch-wise intrinsic components of synthesized color patch at novel view

Total loss functions

Edge-preserving loss ensures gradients of the novel view are the same as the input view
Intrinsic smoothness loss and depth consistency loss are formulated in the same way
Chromaticity consistency loss ensures the chromaticity of the input patch and the extracted albedo are the same

Experiments

Introducing NeRF Extreme, a multi-view dataset with varying illumination
Scenes are not limited to object-centric ones
Details of dataset statistics and experimental settings in supplementary material

Nerf extreme dataset

Collected multi-view images with a variety of light sources
Varied illumination by turn-on/off light sources and closing/opening curtains
Captured outdoor scenes at different times with different sunlights
Images taken in the wild using off-the-shelves camera on mobile phone
Camera poses obtained using COLMAP structure-from-motion framework
Depth maps obtained by multi-view stereo method

Experimental settings

Framework based on JAX and RegNeRF
Used IIDWW code and model without fine-tuning
Experiments on DTU, NeRD, LLFF, and NeRF Extreme datasets
Compared to mip-NeRF and RegNeRF for few-shot view synthesis
Weak comparisons with NeRF-W, NeROIC, and other works

Evaluation metrics

Problem of evaluating novel view synthesis under varying illumination is ill-posed
Synthesizing scene appearances is impossible with input information
Evaluation methods in addition to PSNR and SSIM to compare underlying characteristics of scene regardless of illumination
Metric to measure maintenance of consistent color from synthesized image to inputs
Absolute Relative Error to compare quality of synthesized depth

Experimental results

Proposed ExtremeNeRF outperforms other few-shot view synthesis baselines on NeRF Extreme and light-varying DTU datasets
ExtremeNeRF eliminates distortions from varying illumination inputs
Cross-view consistency regularization helps maintain underlying color consistency and depth map
Albedo consistency regularization helps regularize depth
Occlusion handling prevents consistency regularization between pixels with occlusions
Results prove ExtremeNeRF can maintain physical properties of target scene with sparse inputs and varying illumination
Quantitative comparison shows ExtremeNeRF has lowest CCRD and Abs Rel
Ablation studies show depth consistency term is beneficial for color and depth map
NeROIC-Geom and NeROIC-Full show over-smoothed or diverged results on NeRF Extreme dataset
Relighting results achieved by replacing shading image of synthesized scene
Tent scene results show difficulties in regularizing intrinsic components
Additional qualitative comparisons show ExtremeNeRF performs better than other methods in synthesizing depth

Link to paper#

Abstract#

Paper Content#

Introduction#

Related work#

Preliminaries#

Methodology#

Overview#

Intrinsic consistency regularization#

Albedo estimation#

Total loss functions#

Experiments#

Nerf extreme dataset#

Experimental settings#

Evaluation metrics#

Experimental results#

Link to paper

Abstract

Paper Content

Introduction

Related work

Preliminaries

Methodology

Overview

Intrinsic consistency regularization

Albedo estimation

Total loss functions

Experiments

Nerf extreme dataset

Experimental settings

Evaluation metrics

Experimental results