Abstract
- Novel view synthesis with sparse inputs is a challenging problem for neural radiance fields (NeRF).
- Frequency plays an important role in NeRF’s training.
- Two regularization terms are proposed to address the challenge.
- FreeNeRF achieves state-of-the-art performance across diverse datasets.
Paper Content
Introduction
- Neural Radiance Field (NeRF) can render high-fidelity novel views but struggles with few inputs.
- Existing methods address this challenge with transfer learning, depth supervision, or patch-based regularization.
- FreeNeRF is a simple baseline that requires minimal modifications to NeRF and outperforms existing methods.
- FreeNeRF is dependency-free and overhead-free.
- Frequency regularization stabilizes the learning process and avoids overfitting.
- Occlusion regularization penalizes near-camera density fields.
Related work
- Neural fields use deep neural networks to represent 2D images or 3D scenes as continuous functions.
- NeRF requires hundreds of input images to learn high-quality scene representations.
- Few-shot neural rendering methods try to address the problem by leveraging extra information.
- Positional encoding lies at the heart of NeRF’s success.
- Frequency curriculum is used to tackle the few-shot neural rendering problem.
Method
- Hyperparameter L controls the maximum encoded frequency
- Raw inputs are concatenated with the frequency-encoded inputs
- A ray is cast from the camera’s origin along a direction so that it passes through a pixel
- The color of the ray is computed by quadrature over K sampled points
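The two steps above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the function names `positional_encoding` and `render_ray` are hypothetical, the sin/cos ordering and the factor of pi follow the common NeRF formulation and are assumptions here, and plain Python lists stand in for tensors.

```python
import math

def positional_encoding(x, num_freqs):
    """Encode coordinates x with sin/cos at frequencies 2^0 .. 2^(L-1).

    The raw inputs are kept and placed in front of the encoded values.
    num_freqs plays the role of the hyperparameter L.
    """
    encoded = list(x)  # raw inputs, concatenated with the encoding below
    for level in range(num_freqs):
        freq = (2.0 ** level) * math.pi  # assumed scaling: 2^level * pi
        encoded += [math.sin(freq * v) for v in x]
        encoded += [math.cos(freq * v) for v in x]
    return encoded

def render_ray(densities, colors, t_vals):
    """Approximate a ray's color by quadrature over K samples.

    densities: K non-negative density values along the ray
    colors:    K RGB triples predicted at the same samples
    t_vals:    K sample distances from the camera origin (increasing)
    """
    transmittance = 1.0  # fraction of light that has not been absorbed yet
    pixel = [0.0, 0.0, 0.0]
    for k in range(len(densities)):
        # Distance to the next sample; the last interval is treated as
        # effectively infinite (an assumption borrowed from common practice).
        delta = t_vals[k + 1] - t_vals[k] if k + 1 < len(t_vals) else 1e10
        alpha = 1.0 - math.exp(-densities[k] * delta)  # opacity of this segment
        weight = transmittance * alpha
        pixel = [p + weight * c for p, c in zip(pixel, colors[k])]
        transmittance *= 1.0 - alpha
    return pixel
```

With a 3D input and L = 4, the encoding has 3 + 3 * 2 * 4 = 27 components. In `render_ray`, a sample with zero density contributes nothing, and an opaque sample absorbs all remaining transmittance.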
Frequency regularization
- NeRF learns 3D scene representations from 2D images
- Given only a few input views, NeRF is prone to overfitting
- High-frequency inputs exacerbate overfitting
- Removing high-frequency components avoids catastrophic failure
- Frequency regularization hides high-frequency inputs early in training and reveals them gradually
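One way to picture a frequency curriculum is as a mask over the positional-encoding vector that opens up as training proceeds. The sketch below assumes a simple linear schedule and is not taken from the paper: `frequency_mask`, `apply_mask`, and their parameters are hypothetical names, and the paper's exact mask (for example, how components are grouped or which stay visible from the start) may differ.

```python
def frequency_mask(encoding_length, step, end_step):
    """Per-component weights in [0, 1] for a positional encoding.

    Assumed schedule: the number of visible components grows linearly
    from 0 at step 0 to encoding_length at end_step. The component that
    is currently being revealed gets a fractional weight.
    """
    if step >= end_step:
        return [1.0] * encoding_length  # regularization has ended
    visible = encoding_length * step / end_step  # fractional visible count
    return [min(1.0, max(0.0, visible - i)) for i in range(encoding_length)]

def apply_mask(encoding, mask):
    """Scale each encoded component by its mask weight."""
    return [e * m for e, m in zip(encoding, mask)]
```

For example, `frequency_mask(4, 3, 8)` gives `[1.0, 0.5, 0.0, 0.0]`: the first component is fully visible, the second is halfway in, and the higher ones are still hidden. Components are assumed to be ordered from low to high frequency.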
Occlusion regularization
- Frequency regularization does not solve all problems in few-shot neural rendering.
- Certain characteristic artifacts may still exist in novel views, such as “walls” or “floaters”.
- These failure patterns originate from the least overlapped regions in the training views.
- Occlusion regularization is proposed to penalize dense fields near the camera.
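A penalty on near-camera density can be written as a masked mean over the samples of a ray. The sketch below assumes a binary mask that selects the first few samples (those closest to the camera). The name `occlusion_loss` and the parameter `reg_range` are hypothetical, and the paper's exact formulation may differ.

```python
def occlusion_loss(densities, reg_range):
    """Average density along one ray, counting only near-camera samples.

    densities: K density values ordered from near to far
    reg_range: number of nearest samples to penalize (assumed binary mask)
    """
    num_samples = len(densities)
    mask = [1.0 if i < reg_range else 0.0 for i in range(num_samples)]
    return sum(s * m for s, m in zip(densities, mask)) / num_samples
```

A ray whose first samples are empty incurs no penalty, however dense the scene is further away. In training, this term would be added to the reconstruction loss with a small weight.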
Experiments
Setups
- Evaluated method on three datasets in few-shot settings
- Used PSNR, SSIM, and LPIPS scores as quantitative results
- Also reported the geometric mean of MSE, SSIM, and LPIPS as an “Average” score
- Implemented method on top of DietNeRF and RegNeRF codebases
- Set the end iteration of frequency regularization to 90%, 70%, and 20% of the total training iterations
- Used a weight of 0.01 for the occlusion regularization loss L_occ
- Used regularization range M = 20 for LLFF and Blender, M = 10 for DTU, M = 15 for black/white colors in DTU
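The combined score in the list above can be computed from the three reported metrics. The sketch assumes the convention common in prior NeRF work: MSE is recovered from PSNR as 10^(-PSNR/10), and SSIM enters as sqrt(1 - SSIM) so that lower is better for every term. The function name `average_error` is hypothetical.

```python
import math

def average_error(psnr, ssim, lpips):
    """Geometric mean of three error terms (lower is better).

    Assumed conversions: MSE = 10^(-PSNR/10), and SSIM is turned into
    an error as sqrt(1 - SSIM). LPIPS is already an error measure.
    """
    mse = 10.0 ** (-psnr / 10.0)
    ssim_error = math.sqrt(1.0 - ssim)
    return (mse * ssim_error * lpips) ** (1.0 / 3.0)
```

For instance, PSNR = 20, SSIM = 0.75, and LPIPS = 0.5 give error terms of 0.01, 0.5, and 0.5, whose geometric mean is the cube root of 0.0025.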
Comparison
- FreeNeRF outperforms state-of-the-art methods in terms of novel view synthesis quality and computation overhead
- FreeNeRF has higher PSNR and SSIM scores than other methods
- DietNeRF implicitly distills semantic information from a pre-trained CLIP model
- FreeNeRF outperforms transfer learning-based methods in terms of PSNR and SSIM scores
- FreeNeRF produces higher-quality results than RegNeRF
- FreeNeRF is a lightweight and efficient solution for few-shot neural rendering problems
- FreeNeRF does not require additional steps
- FreeNeRF suffers less from “floaters” than RegNeRF
Ablation study
- Ablations used a batch size of 1024 instead of the 4096 used for the main experiments
- Investigated impact of frequency regularization duration
- 90% schedule best for PSNR score
- Trade-off between PSNR and LPIPS
- Occlusion regularization improves results
- Distortion loss worsens results
- Occlusion regularization can cause over-regularization
Conclusion
- We have presented FreeNeRF, a streamlined approach to few-shot neural rendering
- We studied the relation between input frequency and the failure of few-shot neural rendering
- A simple frequency regularizer can address this challenge
- FreeNeRF outperforms existing methods on multiple datasets
- Future investigation could apply FreeNeRF to other problems with high-frequency noise
- FreeNeRF produces smoother normal estimation which can facilitate applications with glossy surfaces
- High-frequency inputs cause catastrophic failure
- Occlusion regularization penalizes near-camera dense fields
- FreeNeRF achieves best results under “Average” metrics in most settings
- DietNeRF renders “imaginary” components not in original images
- RegNeRF fails to estimate accurate depth and suffers from near-camera floaters
- FreeNeRF performs reasonably well across a wide range of curriculum choices
- Using low-frequency components as inputs enables mip-NeRF to learn meaningful scene representations
- With enough view information, a shorter frequency regularization enables NeRF models to render more high-frequency details
- Overly aggressive occlusion regularization results in incomplete white desks
- Occlusion regularization does not solve remote floaters far from cameras
- DietNeRF generates patches that do not closely match ground truth
- FreeNeRF reconstructs scenes more in line with ground truth
- FreeNeRF has negligible training overhead compared to baselines
- Results show consistent improvement with occlusion regularization
- Using predicted black and white colors as an additional prior improves results