NeRF: Neural Radiance Field

Efficient and comprehensive PyTorch implementation of NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis, from Mildenhall et al. 2020.

Table of Contents

Installation
Quickstart
Description
Implementation
- Details
- Results
Citation

Installation

This implementation has been tested on Ubuntu 20.04 with Python 3.8 and torch 1.9. Install the required packages first with `pip3 install -r requirements.txt`. You may use pyenv or conda to avoid conflicts with your environment.

Download the Blender Scenes Dataset. Rename it and place it in the repo as data/blender (ignored by default).

data/
└── blender
    ├── chair
    ├── drums
    ├── ficus
    ├── hotdog
    ├── lego
    ├── materials
    ├── mic
    └── ship
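Before training, it can help to confirm that the dataset landed in the expected place. The snippet below is a minimal sketch that only checks for the eight scene folders listed in the tree above. `missing_scenes` is a hypothetical helper, not part of this repository, and it does not inspect the contents of each scene.

```python
from pathlib import Path

# Scene folder names taken from the directory tree above.
BLENDER_SCENES = ("chair", "drums", "ficus", "hotdog", "lego", "materials", "mic", "ship")


def missing_scenes(root="data/blender"):
    """Return the scene folders that are absent from `root` (hypothetical helper)."""
    root = Path(root)
    return [scene for scene in BLENDER_SCENES if not (root / scene).is_dir()]


if __name__ == "__main__":
    missing = missing_scenes()
    print("All scenes found." if not missing else f"Missing scenes: {missing}")
```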

Quickstart

Command Line

Action	Command
Train	`python3 -m nerf.train`
Inference	`python3 -m nerf.infer`
Distillation	`python3 -m nerf.distill`
Benchmark	`python3 -m nerf.bench`

Reproduction

Action	Command
Train	`make train`
Distillation	`make distill`
Hybrid	`make hybrid`
Benchmark	`make bench_all`

Manual

# ==== Imports
import nerf.infer         # Enables inference features (NeRF.infer)
import nerf.train         # Enables training features (NeRF.fit)

from nerf.core import BoundedVolumeRaymarcher as BVR, NeRF
from nerf.core import PositionalEncoding as PE
from nerf.core import NeRFScheduler
from nerf.data import BlenderDataset


DEVICE = "cuda:0"

# ==== Setup
dataset = BlenderDataset("./data/blender", scene="hotdog", split="train")

phi_x = PE(3, 6)
phi_d = PE(3, 6)

nerf = NeRF(phi_x, phi_d, width=256, depth=4).to(DEVICE)
raymarcher = BVR(tn=2., tf=6., samples_c=64, samples_f=64)

# ==== Train
history = nerf.fit(
    nerf,                 # NeRF Module
    raymarcher,           # Raymarcher (BVR)
    optim,                # Optimizer (Adam, AdamW, ...)
    scheduler,            # NeRFScheduler
    criterion,            # Criterion (MSELoss, L1Loss, ...)
    scaler,               # GradScaler (torch.cuda.amp, can be disabled)
    dataset,              # Dataset (BlenderDataset)
)                         # More options available (epochs, batch_size, ...)

# ==== Infer
frame = nerf.infer(
    coarse,               # coarse NeRF Module
    fine,                 # fine NeRF Module
    raymarcher,           # Raymarcher (BVR)
    ro,                   # Rays Origin (Tensor of size (B, 3))
    rd,                   # Rays Direction (Tensor of size (B, 3))
    W,                    # Frame Width
    H,                    # Frame Height
)                         # More options available (epochs, batch_size, ...)

Description

NeRF builds on advances in both Computer Graphics and Deep Learning research.

The method encodes a 3D scene as a continuous volume described by density and color at any point within a given bounded volume. During raymarching, the rays query the volume representation model to obtain intersection data. The model is trained end-to-end, using only the ground-truth images as supervision. A first network, the coarse model, is trained using voxel grid sampling to increase sample efficiency. This first pass is then used to train a second network, the fine network, using importance sampling of the volume.

The networks are tied to one unique scene. Caching and acceleration structures can be used to decrease rendering time during inference. The same models can be used to generate a depth map and a 3D mesh of the scene.

Positional Encoding

Positional Encoding: In their original work, Mildenhall et al. introduced positional encoding to let the network learn high-frequency functions. A classical multilayer perceptron without positional encoding cannot do so and only reconstructs low-frequency content.

v = xy | xyz                      # normalized to [-1; 1]

rgb = lambda v: mlp(v)            # wo/ pe-encoding
rgb = lambda v: mlp(phi(v))       # w/  pe-encoding

phi = lambda v: [
  cos(2 ** 0 * PI * v),
  sin(2 ** 0 * PI * v),
  cos(2 ** 1 * PI * v),
  sin(2 ** 1 * PI * v),
  ...
].T
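The pseudocode above can be turned into a small runnable sketch. The version below uses only the standard library and plain lists so it runs anywhere. It is not the `PositionalEncoding` module from `nerf.core`, and the function name `positional_encoding` and its `n_freqs` parameter are assumptions made for illustration.

```python
import math


def positional_encoding(v, n_freqs):
    """Encode each coordinate of `v` with cos/sin at frequencies 2**k * pi."""
    features = []
    for k in range(n_freqs):
        freq = (2 ** k) * math.pi
        # Same ordering as the pseudocode: cos first, then sin, per frequency.
        features.extend(math.cos(freq * x) for x in v)
        features.extend(math.sin(freq * x) for x in v)
    return features


# A point with coordinates normalized to [-1, 1].
encoded = positional_encoding([0.5, -0.25, 1.0], n_freqs=6)
print(len(encoded))  # 3 coordinates * 2 functions * 6 frequencies = 36
```

The output width grows linearly with the number of frequencies, which is what sets the input width of the MLP.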

Fourier Features: In "Fourier Features Let Networks Learn High Frequency Functions in Low Dimensional Domains" (Tancik et al., 2020), the NeRF authors showed that encoding positions with a Fourier feature mapping enables a multilayer perceptron to learn high-frequency functions in low-dimensional problem domains.

v = xy | xyz                      # normalized to [-1; 1]

rgb = lambda v: mlp(v)            # wo/ ff-encoding
rgb = lambda v: mlp(phi(v))       # w/  ff-encoding

phi = lambda v: [
  a_0 * cos(2 * PI * b_0.T * v),
  a_0 * sin(2 * PI * b_0.T * v),
  a_1 * cos(2 * PI * b_1.T * v),
  a_1 * sin(2 * PI * b_1.T * v),
  ...
].T
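A minimal sketch of this mapping follows, assuming the Gaussian variant in which each frequency vector b_j is drawn from a normal distribution and every amplitude a_j is 1. The helper name `make_fourier_features` and the `scale` and `seed` parameters are hypothetical and do not come from this repository.

```python
import math
import random


def make_fourier_features(in_dim, n_features, scale=10.0, seed=0):
    """Build phi(v) = [a_j * cos(2*pi*b_j.v), a_j * sin(2*pi*b_j.v), ...]."""
    rng = random.Random(seed)
    # Assumption: b_j ~ N(0, scale**2) and a_j = 1 for every feature.
    B = [[rng.gauss(0.0, scale) for _ in range(in_dim)] for _ in range(n_features)]
    A = [1.0] * n_features

    def phi(v):
        features = []
        for a_j, b_j in zip(A, B):
            proj = 2.0 * math.pi * sum(b * x for b, x in zip(b_j, v))
            features.append(a_j * math.cos(proj))
            features.append(a_j * math.sin(proj))
        return features

    return phi


phi = make_fourier_features(in_dim=3, n_features=16)
print(len(phi([0.5, -0.25, 1.0])))  # 2 * 16 = 32
```

In this variant the standard deviation `scale` controls how high the sampled frequencies are, and it is typically tuned per task.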

Implicit Representation

The scene is encoded by fitting a simple multilayer perceptron that predicts density sigma and color RGB given position x and direction d queries.

Original Architecture

n = 4

           ReLU    ReLU    
phi(x) --> 256 --> 256 --> ReLU(sigma)
  60    |   n   ^   n  |
        |       |      |           ReLU
        -- cat --      --> 256 --> 128 --> Sigmoid(RGB)
                                ^
                                |
                               cat
                                |
                              phi(d)
                                24
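The widths 60 and 24 in the diagram follow from the encoding itself: one cos and one sin per coordinate and per frequency. The short sketch below reproduces them, assuming the frequency counts of the original paper (10 for positions, 4 for directions). `encoded_width` is a hypothetical helper used only for this arithmetic.

```python
def encoded_width(in_dim, n_freqs):
    """Width of a cos/sin encoding: one cos and one sin per coordinate and frequency."""
    return in_dim * 2 * n_freqs


print(encoded_width(3, 10))  # 60, matches phi(x) in the diagram
print(encoded_width(3, 4))   # 24, matches phi(d) in the diagram
```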

Volume Rendering

Volume raymarching is used to produce the final rendering. A ray is cast from the camera origin through each pixel and sampled N_c times for the coarse model and N_f times for the fine model, within a bounded volume delimited by the near t_n and far t_f camera frustum parameters.

Rendering Equation

N_c, N_f = 64, 128

alpha_i = (1 - exp(-sigma_i * delta_i))
T_i = prod_{j<i} (1 - alpha_j)     # exclusive cumulative product
w_i = T_i * alpha_i
C_c = sum(w_i * c_i)

In this equation, w_i represents a piecewise-constant PDF along the ray (once normalized), T_i the transmittance, i.e. the fraction of light that reaches segment t_i without being blocked by earlier segments, delta_i the segment length dist(t_i-1, t_i), and c_i the color predicted at sample t_i.
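To make the compositing concrete, here is a minimal single-ray sketch in plain Python. It takes T_i as the product of (1 - alpha_j) over the samples before i, so the first sample always sees a transmittance of 1. The function name `composite` and the toy values are assumptions for illustration, not the raymarcher from `nerf.core`.

```python
import math


def composite(sigmas, deltas, colors):
    """Alpha-composite samples along one ray; returns (pixel_color, weights)."""
    transmittance = 1.0  # T_i: fraction of light surviving all previous segments
    weights = []
    pixel = [0.0, 0.0, 0.0]
    for sigma, delta, color in zip(sigmas, deltas, colors):
        alpha = 1.0 - math.exp(-sigma * delta)
        w = transmittance * alpha
        weights.append(w)
        pixel = [p + w * c for p, c in zip(pixel, color)]
        transmittance *= 1.0 - alpha  # update T for the next sample
    return pixel, weights


# Toy ray: empty space, then a dense red segment, then a green one behind it.
sigmas = [0.0, 50.0, 50.0]
deltas = [0.1, 0.1, 0.1]
colors = [(0.0, 0.0, 1.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
pixel, weights = composite(sigmas, deltas, colors)
print(pixel, weights)
```

The empty segment gets zero weight, the red segment absorbs almost all of the light, and the green segment behind it contributes very little because the transmittance has already dropped.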

The weights w_i are reused for inverse transform sampling in the fine pass. A total of N_c + N_f samples is then used to generate the final render, this time querying the fine model instead.
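Inverse transform sampling from the coarse weights can be sketched as follows. The weights are normalized into a piecewise-constant PDF over the coarse bins, accumulated into a CDF, and uniform random numbers are mapped back through that CDF. `sample_fine`, its arguments, and the small `eps` padding are assumptions for illustration and do not reproduce the repository's raymarcher.

```python
import bisect
import random


def sample_fine(bin_edges, weights, n_fine, rng=random):
    """Draw n_fine depths from the piecewise-constant PDF defined by coarse weights."""
    eps = 1e-5  # keeps every bin non-empty so the CDF is strictly increasing
    total = sum(weights) + eps * len(weights)
    cdf = [0.0]
    for w in weights:
        cdf.append(cdf[-1] + (w + eps) / total)
    samples = []
    for _ in range(n_fine):
        u = rng.random()
        # Index of the bin whose CDF interval contains u (clamped for rounding).
        i = min(bisect.bisect_right(cdf, u) - 1, len(weights) - 1)
        frac = (u - cdf[i]) / (cdf[i + 1] - cdf[i])
        samples.append(bin_edges[i] + frac * (bin_edges[i + 1] - bin_edges[i]))
    return sorted(samples)


edges = [2.0, 3.0, 4.0, 5.0, 6.0]  # four coarse bins between t_n = 2 and t_f = 6
weights = [0.0, 0.9, 0.1, 0.0]     # toy coarse weights, one per bin
fine = sample_fine(edges, weights, n_fine=8, rng=random.Random(0))
print(fine)
```

Bins with larger coarse weights receive proportionally more fine samples, which concentrates network queries near surfaces.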

Implementation

Details

Feature	Reference
Fourier Feature Encoding
Positional Encoding
Neural Radiance Field Model
Bounded Volume Raymarcher
Noise for Continuous Representation
Camera Paths (Turnaround, ...)
Interactive Notebook
Reptile Meta-Learning	Tancik et al., Nichol et al.
Shifted Softplus for Sigma	Barron et al.
Widened Sigmoid for RGB	Barron et al.
Fine Network (Differs from Original, No second Network)
Training Optimizations	Nvidia's PyTorch Performance Tuning Guide
Safe Softplus, Sigmoid	Blog Article by Jia Fu Low
Gradient Clipping
NeRF/JAX-NeRF Warmup Decay Learning Rate Scheduler	Barron et al.
Log Decay Learning Rate Scheduler
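As an illustration of the activation rows in the table, the sketch below shows one common formulation of a shifted softplus for density and a widened sigmoid for color, written in a numerically stable way. The shift of 1 and the padding `eps = 1e-3` follow the formulation of Barron et al. as understood here. They are assumptions, and the repository may use different values or a different stabilization.

```python
import math


def safe_softplus(x):
    """Numerically stable softplus: log(1 + exp(x)) without overflow for large |x|."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def shifted_softplus(x, shift=1.0):
    """Density activation: softplus(x - shift). The shift value is an assumption."""
    return safe_softplus(x - shift)


def widened_sigmoid(x, eps=1e-3):
    """Color activation with a range slightly wider than [0, 1]."""
    # Stable sigmoid: never exponentiates a large positive number.
    if x >= 0:
        s = 1.0 / (1.0 + math.exp(-x))
    else:
        e = math.exp(x)
        s = e / (1.0 + e)
    return (1.0 + 2.0 * eps) * s - eps


print(shifted_softplus(1.0), widened_sigmoid(0.0))
```

Softplus keeps a non-zero gradient for negative pre-activations, unlike ReLU, and the widened sigmoid lets the network reach pure black or white without saturating.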

Results

[Figure: per-scene comparison of Ground Truth, NeRF RGB Map, and NeRF Depth Map for Chair, Lego, HotDog, Drums, Mic, Materials, Ficus, and Ship]

Using 64 Coarse Samples, 64 Fine Samples at 400x400 Resolution

Coarse	Fine	Seconds	FPS
NeRF	NeRF	1.91	0.52
DistillNeRF	NeRF	1.37	0.73
DistillNeRF	DistillNeRF	0.30	3.36

Citation

Original Work

@inproceedings{mildenhall2020nerf,
  title={NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis},
  author={Ben Mildenhall and Pratul P. Srinivasan and Matthew Tancik and Jonathan T. Barron and Ravi Ramamoorthi and Ren Ng},
  year={2020},
  booktitle={ECCV},
}
