NeRF (Neural Radiance Fields) is a method that achieves state-of-the-art results for synthesizing novel views of complex scenes. Here are some videos generated by this repository (pre-trained models are provided below):
This project is a faithful PyTorch implementation of NeRF that reproduces the results while running 1.3 times faster. The code is based on the authors' TensorFlow implementation here and has been tested to match it numerically.
Download the Lego dataset: http://cseweb.ucsd.edu/~viscomp/projects/LF/papers/ECCV20/nerf/nerf_example_data.zip
git clone https://github.com/Henrik-JIA/NeRF-Pytorch-Control-Positional-Encoding.git
cd NeRF-Pytorch-Control-Positional-Encoding
pip install -r requirements.txt
Dependencies:
- PyTorch 1.4
- matplotlib
- numpy
- imageio
- imageio-ffmpeg
- configargparse
- open3d
The LLFF data loader requires ImageMagick.
You will also need the LLFF code (and COLMAP) set up to compute poses if you want to run on your own real data.
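If you are preparing your own scene, the usual LLFF workflow is to place your photos in `<your_scenedir>/images/` and run the LLFF repo's `imgs2poses.py` script, which calls COLMAP to estimate the camera poses. The invocation below follows the upstream LLFF usage and may differ slightly depending on the LLFF version you have:

python imgs2poses.py <your_scenedir>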
Download data for two example datasets: `lego` and `fern`:
bash download_example_data.sh
To train a low-res `lego` NeRF:
python run_nerf.py --config configs/lego.txt
After training for 100k iterations (~4 hours on a single 2080 Ti), you can find the following video at `logs/lego_test/lego_test_spiral_100000_rgb.mp4`.
Show the poses and images of the dataset:
# Visualizing data poses
parser.add_argument("--visualize_poses", type=bool, default=False,
help='visualize poses for blender dataset')
# Visualizing data images
parser.add_argument("--visualize_imgs", type=bool, default=False,
help='visualize images for blender dataset')
| Poses | Images |
|---|---|
Positional encoding comparison: compare the results before and after enabling positional encoding. In this project, you can choose whether to use positional encoding to enhance the representation of the input coordinates. The relevant configuration parameters are as follows:
- `--use_embedder`: Whether to use positional encoding. Default is `True`.
- `--save_embed_result`: Whether to save the results of positional encoding. Default is `True`.
- `--save_embed_iter`: The iteration at which to save the positional encoding results. Default is `500`.
The following parameter settings enable positional encoding and save a comparison with the original image after 500 iterations:
parser.add_argument("--use_embedder", type=bool, default=True,
help='use positional encoding for input coordinates')
parser.add_argument("--save_embed_result", type=bool, default=True,
help='save positional encoding results')
parser.add_argument("--save_embed_iter", type=int, default=500,
help='iteration to save positional encoding results')
If you do not want to save the results, set `--save_embed_result` to `False` and nothing will be saved during training. Only when `--save_embed_result` is set to `True` will the positional encoding result be compared with the original image at the iteration set by `--save_embed_iter`, and the comparison saved.
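For reference, the positional encoding itself simply maps each input coordinate to sines and cosines at exponentially growing frequencies. Below is a minimal, self-contained sketch of that idea; the function name and arguments are illustrative rather than this repository's exact identifiers (like the released NeRF code, it omits the factor of π that appears in the paper's notation):

import torch

def positional_encoding(x, num_freqs=10, include_input=True):
    """Map coordinates x of shape (..., D) to sinusoidal features.

    gamma(p) = (p, sin(2^0 p), cos(2^0 p), ..., sin(2^(L-1) p), cos(2^(L-1) p))
    Output shape: (..., D * (2 * num_freqs + int(include_input))).
    """
    feats = [x] if include_input else []
    for i in range(num_freqs):
        freq = 2.0 ** i
        feats.append(torch.sin(x * freq))
        feats.append(torch.cos(x * freq))
    return torch.cat(feats, dim=-1)

# Example: with L = 10 frequency bands (the paper's setting for spatial
# coordinates), each 3D point expands from 3 to 3 + 3*2*10 = 63 values.
pts = torch.rand(1024, 3)
encoded = positional_encoding(pts, num_freqs=10)
print(encoded.shape)  # torch.Size([1024, 63])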
Without positional encoding:
If positional encoding is enabled and the `save_embed_result` parameter is set to `True`, the positional encoding results are saved at the specified iteration (`save_embed_iter`). The following files are generated (a short example of loading the saved arrays is given after this list):
- `args.txt`: Contains all the training parameters, making it easy to record and reproduce experiments. It is saved as `logs/{expname}/args.txt`.
- `embedded_coords_with_encoding.npy`: The coordinates after applying positional encoding.
- `embedded_coords_without_encoding.npy`: The original coordinates without positional encoding.
- `original_coords.png`: A plot of the original coordinates.
- `position_encoding_with.png`: A plot of the coordinates with positional encoding.
- `position_encoding_without.png`: A plot of the coordinates without positional encoding.
- `comparison_with_encoding.png` (`--use_embedder` is `True` and `--save_embed_result` is `True`): A comparison plot of the ground-truth image and the model output with positional encoding.
- `comparison_without_encoding.png` (`--use_embedder` is `False` and `--save_embed_result` is `True`): A comparison plot of the ground-truth image and the model output without positional encoding.
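If you want to inspect the saved coordinate arrays outside of training, they can be loaded with NumPy. The paths below are hypothetical and assume the files are written to `logs/{expname}/` next to `args.txt`; adjust them to wherever your run actually writes its outputs:

import numpy as np

# Hypothetical paths; substitute your own expname / output directory.
with_enc = np.load("logs/lego_test/embedded_coords_with_encoding.npy")
without_enc = np.load("logs/lego_test/embedded_coords_without_encoding.npy")
print(with_enc.shape, without_enc.shape)  # the encoded array has many more channels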
To train a low-res `fern` NeRF:
python run_nerf.py --config configs/fern.txt
After training for 200k iterations (~8 hours on a single 2080 Ti), you can find the following videos at `logs/fern_test/fern_test_spiral_200000_rgb.mp4` and `logs/fern_test/fern_test_spiral_200000_disp.mp4`.
To play with other scenes presented in the paper, download the data here. Place the downloaded dataset according to the following directory structure (a sample config pointing into this tree is sketched after it):
├── configs
│   ├── ...
│
├── data
│   ├── nerf_llff_data
│   │   └── fern
│   │   └── flower  # downloaded llff dataset
│   │   └── horns   # downloaded llff dataset
│   │   └── ...
│   ├── nerf_synthetic
│   │   └── lego
│   │   └── ship    # downloaded synthetic dataset
│   │   └── ...
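Each config file's `datadir` should point into this tree. As a rough illustration only (the keys follow the upstream nerf-pytorch configs; the exact values used in this repository may differ), a synthetic-scene config looks something like:

expname = lego_test
basedir = ./logs
datadir = ./data/nerf_synthetic/lego
dataset_type = blender
half_res = True
white_bkgd = True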
To train NeRF on different datasets:
python run_nerf.py --config configs/{DATASET}.txt
replace `{DATASET}` with `trex` | `horns` | `flower` | `fortress` | `lego` | etc.
To test NeRF trained on different datasets:
python run_nerf.py --config configs/{DATASET}.txt --render_only
replace `{DATASET}` with `trex` | `horns` | `flower` | `fortress` | `lego` | etc.
You can download the pre-trained models here. Place the downloaded directory in `./logs` in order to test it later. See the following directory structure for an example:
├── logs
│ ├── fern_test
│ ├── flower_test # downloaded logs
│ ├── trex_test # downloaded logs
Tests that ensure the results of all functions and the training loop match the official implementation are contained in a separate branch, `reproduce`. You can check it out and run the tests:
git checkout reproduce
py.test
NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis
Ben Mildenhall*¹,
Pratul P. Srinivasan*¹,
Matthew Tancik*¹,
Jonathan T. Barron²,
Ravi Ramamoorthi³,
Ren Ng¹
¹UC Berkeley, ²Google Research, ³UC San Diego
*denotes equal contribution
A neural radiance field is a simple fully connected network (weights are ~5MB) trained to reproduce input views of a single scene using a rendering loss. The network directly maps from spatial location and viewing direction (5D input) to color and opacity (4D output), acting as the "volume", so we can use volume rendering to differentiably render new views.
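To make "volume rendering" concrete, here is a minimal sketch of the standard quadrature used to composite per-sample colors and densities along a ray into a pixel color; the function name and shapes are illustrative, not this repository's exact API:

import torch

def composite_ray(rgb, sigma, deltas):
    """Composite per-sample predictions along one ray.

    rgb:    (N_samples, 3) predicted colors
    sigma:  (N_samples,)   predicted volume densities
    deltas: (N_samples,)   distances between adjacent samples
    Returns the rendered pixel color, shape (3,).
    """
    alpha = 1.0 - torch.exp(-sigma * deltas)            # per-segment opacity
    # Transmittance T_i: probability the ray reaches sample i unoccluded
    trans = torch.cumprod(1.0 - alpha + 1e-10, dim=0)
    trans = torch.cat([torch.ones(1), trans[:-1]])      # shift so T_0 = 1
    weights = alpha * trans
    return (weights[:, None] * rgb).sum(dim=0)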
Kudos to the authors for their amazing results:
@misc{mildenhall2020nerf,
title={NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis},
author={Ben Mildenhall and Pratul P. Srinivasan and Matthew Tancik and Jonathan T. Barron and Ravi Ramamoorthi and Ren Ng},
year={2020},
eprint={2003.08934},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
However, if you find this implementation or the pre-trained models helpful, please consider citing:
@misc{lin2020nerfpytorch,
title={NeRF-pytorch},
author={Yen-Chen, Lin},
publisher = {GitHub},
journal = {GitHub repository},
howpublished={\url{https://github.com/yenchenlin/nerf-pytorch/}},
year={2020}
}