Link to paper

The full paper is available here.

You can also find the paper on PapersWithCode here.

Abstract

  • Novel view synthesis with sparse inputs is a challenging problem for neural radiance fields (NeRF).
  • Frequency plays an important role in NeRF’s training.
  • Two regularization terms are proposed to address the challenge.
  • FreeNeRF achieves state-of-the-art performance across diverse datasets.

Paper Content

Introduction

  • Neural Radiance Field (NeRF) can render high-fidelity novel views but struggles with few inputs.
  • Existing methods address this challenge with transfer learning, depth-supervision, and patch-based regularization.
  • FreeNeRF is a simple baseline that requires minimal modifications to NeRF and outperforms existing methods.
  • FreeNeRF is dependency-free and overhead-free.
  • Frequency regularization stabilizes the learning process and avoids overfitting.
  • Occlusion regularization penalizes near-camera density fields.
  • Neural fields use deep neural networks to represent 2D images or 3D scenes as continuous functions.
  • NeRF requires hundreds of input images to learn high-quality scene representations.
  • Few-shot Neural Rendering attempts to address the challenging few-shot neural rendering problem by leveraging extra information.
  • Positional encoding lies at the heart of NeRF’s success.
  • Frequency curriculum is used to tackle the few-shot neural rendering problem.

Method

  • Hyperparameter L controls maximum encoded frequency
  • Raw inputs are concatenated with frequency-encoded inputs
  • Ray is cast from camera’s origin along direction to pass through pixel
  • Color of ray is computed using quadrature of K sampled points

Frequency regularization

  • Few-shot neural rendering is prone to overfitting
  • NeRF learns 3D scene representations from 2D images
  • Given few input views, NeRF is prone to overfitting
  • High-frequency inputs exacerbate overfitting
  • Removing high-frequency components avoids catastrophic failure
  • Frequency regularization circumvents high-frequency signals

Occlusion regularization

  • Frequency regularization does not solve all problems in few-shot neural rendering.
  • Certain characteristic artifacts may still exist in novel views, such as “walls” or “floaters”.
  • These failure patterns originate from the least overlapped regions in the training views.
  • Occlusion regularization proposed to penalize dense fields near the camera.

Experiments

Setups

  • Evaluated method on three datasets in few-shot settings
  • Used PSNR, SSIM, and LPIPS scores as quantitative results
  • Used geometric mean of MSE, SSIM, and LPIPS
  • Implemented method on top of DietNeRF and RegNeRF codebases
  • Set end iteration of frequency regularization to 90%, 70%, and 20%
  • Used weight of 0.01 for L occ regularization
  • Used M = 20 for LLFF and Blender, M = 10 for DTU, M = 15 for black/white colors in DTU

Comparison

  • FreeNeRF outperforms state-of-the-art methods in terms of novel view synthesis quality and computation overhead
  • FreeNeRF has higher PSNR and SSIM scores than other methods
  • DietNeRF implicitly distills semantic information from a pre-trained CLIP model
  • FreeNeRF outperforms transfer learning-based methods in terms of PSNR and SSIM scores
  • FreeNeRF produces higher-quality results than RegNeRF
  • FreeNeRF is a lightweight and efficient solution for few-shot neural rendering problems
  • FreeNeRF does not require additional steps
  • FreeNeRF suffers less from “floaters” than ReNeRF

Ablation study

  • Used batch size of 1024 instead of 4096 for main experiments
  • Investigated impact of frequency regularization duration
  • 90% schedule best for PSNR score
  • Trade-off between PSNR and LPIPS
  • Occlusion regularization improves results
  • Distortion loss worsens results
  • Occlusion regularization can cause over-regularization

Conclusion

  • We have presented FreeNeRF, a streamlined approach to few-shot neural rendering
  • We studied the relation between input frequency and the failure of few-shot neural rendering
  • A simple frequency regularizer can address this challenge
  • FreeNeRF outperforms existing methods on multiple datasets
  • Future investigation could apply FreeNeRF to other problems with high-frequency noise
  • FreeNeRF produces smoother normal estimation which can facilitate applications with glossy surfaces
  • High-frequency inputs cause catastrophic failure
  • Trade-off between PSNR and LPIPS
  • Occlusion regularization penalizes near-camera dense fields
  • FreeNeRF achieves best results under “Average” metrics in most settings
  • DietNeRF renders “imaginary” components not in original images
  • RegNeRF fails to estimate accurate depth and suffers from near-camera floaters
  • FreeNeRF achieves reasonably well performance across a wide range of curriculum choices
  • Using low-frequency components as inputs enables mipNeRF to learn meaningful scene representations
  • With enough view information, a shorter frequency regularization enables NeRF models to render more high-frequency details
  • Aggressive occlusion results in incomplete white desks
  • Occlusion regularization does not solve remote floaters far from cameras
  • DietNeRF generates patches that do not closely match ground truth
  • FreeNeRF reconstructs scenes more in line with ground truth
  • FreeNeRF has negligible training overhead compared to baselines
  • Results show consistent improvement with occlusion regularization
  • Using predicted black & white color as additional prior improves results