Link to paper

The full paper is available here.

You can also find the paper on PapersWithCode here.

Abstract

Novel view synthesis with sparse inputs is a challenging problem for neural radiance fields (NeRF).
Frequency plays an important role in NeRF’s training.
Two regularization terms are proposed to address the challenge.
FreeNeRF achieves state-of-the-art performance across diverse datasets.

Paper Content

Introduction

Neural Radiance Field (NeRF) can render high-fidelity novel views but struggles with few inputs.
Existing methods address this challenge with transfer learning, depth supervision, or patch-based regularization.
FreeNeRF is a simple baseline that requires minimal modifications to NeRF and outperforms existing methods.
FreeNeRF is dependency-free and overhead-free.
Frequency regularization stabilizes the learning process and avoids overfitting.
Occlusion regularization penalizes near-camera density fields.

Neural fields use deep neural networks to represent 2D images or 3D scenes as continuous functions.
NeRF requires hundreds of input images to learn high-quality scene representations.
Prior work on few-shot neural rendering addresses the problem by leveraging extra information.
Positional encoding lies at the heart of NeRF’s success.
Frequency curriculum is used to tackle the few-shot neural rendering problem.

Method

Hyperparameter L controls maximum encoded frequency
Raw inputs are concatenated with frequency-encoded inputs
Ray is cast from camera’s origin along direction to pass through pixel
Color of ray is computed using quadrature of K sampled points
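
The two ingredients above (frequency encoding of the inputs and quadrature along a ray) can be illustrated with a minimal sketch. The function names, the ordering of the sin/cos features, and the omission of a factor of pi inside the sinusoids are assumptions made for illustration; implementations differ on these details, and this is not the authors' code.

```python
import math

def positional_encoding(x, num_freqs):
    """Concatenate raw inputs with sin/cos features at num_freqs octaves."""
    encoded = list(x)  # raw inputs come first
    for k in range(num_freqs):
        scale = 2.0 ** k  # frequency doubles with each band
        encoded.extend(math.sin(scale * v) for v in x)
        encoded.extend(math.cos(scale * v) for v in x)
    return encoded

def render_ray(densities, colors, deltas):
    """Quadrature estimate of a ray's color from K samples ordered near to far."""
    transmittance = 1.0  # fraction of light not yet absorbed
    pixel = [0.0, 0.0, 0.0]
    for sigma, color, delta in zip(densities, colors, deltas):
        alpha = 1.0 - math.exp(-sigma * delta)  # opacity of this segment
        weight = transmittance * alpha
        pixel = [p + weight * c for p, c in zip(pixel, color)]
        transmittance *= 1.0 - alpha
    return pixel
```

Here `num_freqs` plays the role of the hyperparameter L, and a 3D point encoded with L bands yields 3 + 6L features. `deltas` are the distances between adjacent samples along the ray.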

Frequency regularization

Few-shot neural rendering is prone to overfitting
NeRF learns 3D scene representations from 2D images
Given few input views, NeRF is prone to overfitting
High-frequency inputs exacerbate overfitting
Removing high-frequency components avoids catastrophic failure
Frequency regularization circumvents high-frequency signals
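
One way to realize such a frequency curriculum is a mask over the encoded input that reveals higher-frequency features as training progresses. The sketch below assumes a linear schedule, an encoding laid out from low to high frequency in groups of three values, and a length divisible by the group size; `frequency_mask` and its parameters are hypothetical names, not the paper's API.

```python
def frequency_mask(enc_len, step, end_step, group=3):
    """Per-feature mask in [0, 1]; low-frequency groups become visible first."""
    if step >= end_step:
        return [1.0] * enc_len  # regularization is over: everything is visible
    num_groups = enc_len // group
    # The number of visible groups grows linearly; the first group is always on.
    ptr = min(num_groups * step / end_step + 1.0, float(num_groups))
    whole = int(ptr)
    mask = [0.0] * enc_len
    for i in range(whole * group):
        mask[i] = 1.0  # fully visible groups
    for i in range(whole * group, min(whole * group + group, enc_len)):
        mask[i] = ptr - whole  # the next group fades in gradually
    return mask

def apply_mask(encoded, mask):
    """Element-wise product of the encoded input and the mask."""
    return [m * v for m, v in zip(mask, encoded)]
```

With a 15-dimensional encoding and `end_step=100`, only the first three features pass through at step 0, and all features pass through from step 100 onward.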

Occlusion regularization

Frequency regularization does not solve all problems in few-shot neural rendering.
Certain characteristic artifacts may still exist in novel views, such as “walls” or “floaters”.
These failure patterns originate from the least overlapped regions in the training views.
Occlusion regularization is proposed to penalize dense fields near the camera.
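
A minimal sketch of a penalty on near-camera density, assuming the densities of the K samples along a ray are ordered from nearest to farthest and that only the first M samples are penalized. The exact form (sum over the near samples divided by K) and the names `occlusion_loss` and `reg_range` are assumptions for illustration.

```python
def occlusion_loss(densities, reg_range):
    """Average density over K samples, counting only the first reg_range of them."""
    num_samples = len(densities)
    near = densities[:reg_range]  # samples closest to the camera origin
    return sum(near) / num_samples

def total_loss(rgb_loss, densities, reg_range, occ_weight=0.01):
    """Photometric loss plus the weighted occlusion penalty."""
    return rgb_loss + occ_weight * occlusion_loss(densities, reg_range)
```

Minimizing this term pushes density away from the region just in front of the camera, which is where "walls" and "floaters" tend to appear.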

Experiments

Setups

Evaluated method on three datasets (DTU, LLFF, and Blender) in few-shot settings
Used PSNR, SSIM, and LPIPS scores as quantitative results
Also reported an “Average” score: the geometric mean of MSE (= 10^(-PSNR/10)), sqrt(1 - SSIM), and LPIPS
Implemented method on top of DietNeRF and RegNeRF codebases
Set end iteration of frequency regularization to 90%, 70%, and 20% of the total iterations for the 3-, 6-, and 9-view settings, respectively
Used weight of 0.01 for the occlusion regularization loss L_occ
Used M = 20 for LLFF and Blender, M = 10 for DTU, M = 15 for black/white colors in DTU
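
The combined score can be sketched as follows, assuming the common convention of converting PSNR back to MSE and SSIM to sqrt(1 - SSIM) before taking the geometric mean. The function name `average_error` is hypothetical; lower values are better.

```python
import math

def average_error(psnr, ssim, lpips):
    """Geometric mean of MSE, sqrt(1 - SSIM), and LPIPS (lower is better)."""
    mse = 10.0 ** (-psnr / 10.0)  # invert PSNR = -10 * log10(MSE)
    dssim = math.sqrt(1.0 - ssim)  # turn a similarity into an error
    return (mse * dssim * lpips) ** (1.0 / 3.0)
```

Because all three terms are errors, improving any one metric while holding the others fixed lowers the score.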

Comparison

FreeNeRF outperforms state-of-the-art methods in terms of novel view synthesis quality and computation overhead
FreeNeRF has higher PSNR and SSIM scores than other methods
DietNeRF implicitly distills semantic information from a pre-trained CLIP model
FreeNeRF outperforms transfer learning-based methods in terms of PSNR and SSIM scores
FreeNeRF produces higher-quality results than RegNeRF
FreeNeRF is a lightweight and efficient solution for few-shot neural rendering problems
FreeNeRF does not require additional steps
FreeNeRF suffers less from “floaters” than RegNeRF

Ablation study

Ablations used a batch size of 1024 instead of the 4096 used for the main experiments
Investigated impact of frequency regularization duration
90% schedule best for PSNR score
Trade-off between PSNR and LPIPS
Occlusion regularization improves results
Distortion loss worsens results
Occlusion regularization can cause over-regularization

Conclusion

We have presented FreeNeRF, a streamlined approach to few-shot neural rendering
We studied the relation between input frequency and the failure of few-shot neural rendering
A simple frequency regularizer can address this challenge
FreeNeRF outperforms existing methods on multiple datasets
Future investigation could apply FreeNeRF to other problems with high-frequency noise
FreeNeRF produces smoother normal estimation which can facilitate applications with glossy surfaces
High-frequency inputs cause catastrophic failure
Trade-off between PSNR and LPIPS
Occlusion regularization penalizes near-camera dense fields
FreeNeRF achieves best results under “Average” metrics in most settings
DietNeRF renders “imaginary” components not in original images
RegNeRF fails to estimate accurate depth and suffers from near-camera floaters
FreeNeRF performs reasonably well across a wide range of curriculum choices
Using low-frequency components as inputs enables mipNeRF to learn meaningful scene representations
With enough view information, a shorter frequency regularization enables NeRF models to render more high-frequency details
Overly aggressive occlusion regularization results in incomplete white desks
Occlusion regularization does not solve remote floaters far from cameras
DietNeRF generates patches that do not closely match ground truth
FreeNeRF reconstructs scenes more in line with ground truth
FreeNeRF has negligible training overhead compared to baselines
Results show consistent improvement with occlusion regularization
Using predicted black & white color as additional prior improves results
