| Methods | Official / Un-official | Get started | Notes, major difference from paper, etc. |
|---|---|---|---|
| NeuS in minutes | Un-official | readme | - support object-centric datasets as well as indoor datasets<br />- fast and stable convergence without needing mask<br />- support using NGP / LoTD or MLPs as cr&dv representations<br />- large pixel batch size (4096) & pixel error maps |
| NGP | Un-official |  | - support using NGP / LoTD or MLPs as representations |
| Methods | Official / Un-official | Get started | Notes, major difference from paper, etc. |
|---|---|---|---|
| StreetSurf | Official | readme | - LiDAR loss improved (using L1 and discarding outliers) |
| NGP with LiDAR | Un-official | readme | - using Urban-NeRF's LiDAR loss |
- Dataset preparation
- Training
- Rendering
- Appearance evaluation
- LiDAR simulation
- LiDAR evaluation
- Mesh extraction
- Occupancy grid extraction
- [readme] ⬅️ NeuS's version of DTU / BlendedMVS Dataset
- [readme] ⬅️ BlendedMVS Dataset
- [readme] ⬅️ MonoSDF's version of Replica / ScanNet Dataset
- [readme] ⬅️ Waymo Open Dataset - Perception
NOTE:

- 🏃 You can combine multiple subtasks listed below and automatically execute them one by one with run.py. For example:

  ```shell
  python code_single/tools/run.py train,eval,eval_lidar,extract_mesh \
      --config code_single/configs/xxx.yaml \
      --eval.downscale=2 --eval_lidar.lidar_id=lidar_TOP \
      --extract_mesh.to_world --extract_mesh.res=0.1
  ```

  `--config` or `--resume_dir` are common args shared across all subtasks.

- 📌 All the instructions below assume you have already `cd`-ed into `/path/to/neuralsim`.
To start training:

```shell
python code_single/tools/train.py --config code_single/configs/xxx.yaml
```
⚙️ You can specify temporary configs via command line args like `--aaa.bbb=ccc`, which will temporarily modify the `aaa:bbb` field of `xxx.yaml` for this run. For more details on how the command line and yaml configuration work, please refer to this doc.
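To illustrate the mechanism, here is a minimal sketch of how such a dot-notation override maps onto nested yaml fields; this is illustrative only, not the repo's actual config parser, and the field values below are stand-ins:

```python
# Minimal sketch (NOT the repo's actual parser): a dot-notation override like
# `--training.i_log=100` walks the nested config dict and replaces the leaf value.
def apply_override(cfg: dict, dotted_key: str, value) -> None:
    *parents, leaf = dotted_key.split('.')
    node = cfg
    for key in parents:
        node = node.setdefault(key, {})  # descend, creating intermediate dicts if absent
    node[leaf] = value

cfg = {'training': {'i_log': 50, 'i_val': 500}}  # stand-in for the loaded xxx.yaml
apply_override(cfg, 'training.i_log', 100)       # equivalent in spirit to --training.i_log=100
assert cfg['training']['i_log'] == 100
```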
To resume a previous experiment:

```shell
python code_single/tools/train.py --resume_dir /path/to/logs/xxx
```

You can also resume an experiment, provided nothing in the original config yaml has changed, by directly specifying `--config code_single/configs/xxx.yaml`.
We provide rich logging information with tensorboard. Check it out with:

```shell
tensorboard --logdir /path/to/logs/xxx
```

The logging frequency of scalars is controlled by the `training:i_log` field (how many iterations per log entry). The logging frequency of images (visualizations or renderings) is controlled by the `training:i_val` field.
➡️ Single node multi GPUs
Taking an example of a single machine with 4 GPUs:
📌 NOTE: You only need to add the `--ddp` option to the command line arguments of train.py.
```shell
python -m torch.distributed.launch --nproc_per_node=4 \
    code_single/tools/train.py \
    --config code_single/configs/waymo/streetsurf/withmask_withlidar_joint.240219.yaml \
    --ddp
```
In the above example, if everything works properly, you will see the following message printed four times with different ranks in the logs:
```
=> Enter init_process_group():
=> rank=0
=> world_size=4
=> local_rank=0
=> master_addr=127.0.0.1
=> master_port=29500
...
=> Done init Env @ DDP:
=> device_ids set to [0]
=> rank=0
=> world_size=4
=> local_rank=0
=> master_addr=127.0.0.1
=> master_port=29500
...
```
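For reference, below is a minimal sketch of what typically happens behind a `--ddp` flag when launched this way, assuming the default env:// initialization of torch.distributed; the repo's actual setup may differ in its details:

```python
# Minimal DDP bootstrap sketch (assumption: env:// init; not necessarily the repo's exact code).
import os
import torch
import torch.distributed as dist

# torchrun / recent torch.distributed.launch set LOCAL_RANK per process
# (older launch versions pass it as a --local_rank argument instead).
local_rank = int(os.environ.get('LOCAL_RANK', 0))
torch.cuda.set_device(local_rank)  # corresponds to the "device_ids set to [...]" log line

# Reads RANK / WORLD_SIZE / MASTER_ADDR / MASTER_PORT from the environment,
# which is exactly what the log lines above echo back.
dist.init_process_group(backend='nccl')
print(f"rank={dist.get_rank()} world_size={dist.get_world_size()} local_rank={local_rank}")
```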
➡️ Multi nodes multi GPUs
Taking an example of 2 nodes with 4 GPUs each (i.e. 8 GPUs in total):
📌 NOTE: You only need to add the `--ddp` option to the command line arguments of train.py.
```shell
python -m torch.distributed.launch --nnodes=$WORLD_SIZE --nproc_per_node=4 \
    --master_addr=$MASTER_ADDR --master_port=$MASTER_PORT --node_rank=$RANK \
    code_single/tools/train.py \
    --config code_single/configs/waymo/streetsurf/withmask_withlidar_joint.240219.yaml \
    --ddp
```
In the above example, if everything works properly, you will see the following message printed four times with different ranks in the logs of the master node:
```
=> Enter init_process_group():
=> rank=0
=> world_size=8
=> local_rank=0
=> master_addr=dlcfevx8ltikuljg-master-0
=> master_port=23456
...
=> Done init Env @ DDP:
=> device_ids set to [0]
=> rank=0
=> world_size=8
=> local_rank=0
=> master_addr=dlcfevx8ltikuljg-master-0
=> master_port=23456
...
```
As for the worker node's logs (note that rank = node_rank × nproc_per_node + local_rank, hence rank=4 for the second node's first process):
```
=> Enter init_process_group():
=> rank=4
=> world_size=8
=> local_rank=0
=> master_addr=dlcfevx8ltikuljg-master-0
=> master_port=23456
...
=> Done init Env @ DDP:
=> device_ids set to [0]
=> rank=4
=> world_size=8
=> local_rank=0
=> master_addr=dlcfevx8ltikuljg-master-0
=> master_port=23456
...
```
We also provide a primitive debugging tool for checking gradients. You can try it out by setting `self.debug_grad=True` in the `Trainer` class. Note that this will significantly slow down training and should be used along with `debugpy` or other debugging tools.
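For reference, the usual way to attach `debugpy` is the generic listen-and-wait pattern below; this is standard debugpy usage, not repo-specific code:

```python
# Generic debugpy attach pattern (assumption: not repo-specific code).
# Place this near the entry point of the training script, then attach your
# IDE's remote debugger to localhost:5678 before execution continues.
import debugpy

debugpy.listen(5678)       # start the debug adapter on localhost:5678
debugpy.wait_for_client()  # block until a debugger attaches
```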
The tools/render.py script works in two modes: replay mode and NVS (novel view synthesis) mode. Both modes support additional LiDAR simulation or mesh visualization along with RGB, depth, and surface normal rendering.

By default, tools/render.py runs in replay mode, which renders the frames between the optionally given `--start_frame` and `--stop_frame` parameters with everything else untouched.
```shell
python code_single/tools/render.py --resume_dir /path/to/logs/xxx --downscale=1 \
    --assetbank_cfg.Main.model_params.ray_query_cfg.query_param.num_coarse=0
```
NOTE:

- For street-view, rendering full-size videos often consumes a lot of time. It is recommended to specify `--downscale=2` or larger values.
- Usually, ignoring `num_coarse` samples will not significantly affect the results and will speed up rendering.
  - For StreetSurf, simply add `--assetbank_cfg.Street.model_params.ray_query_cfg.query_param.num_coarse=0`
  - For other single-object datasets, simply add `--assetbank_cfg.Main.model_params.ray_query_cfg.query_param.num_coarse=0`
- Many other options can be specified while rendering, including `--no_sky`, `--only_cr`, `--fps`, `--rayschunk`, etc. Check out tools/render.py for more details.
By giving `--nvs_path=...` etc. to specify the type of the novel camera trajectory and other configs, tools/render.py runs in NVS mode. `--nvs_node_id` specifies the scene graph node whose trajectory you wish to manipulate. Typically, for single-object datasets, this node is `camera`; for street-view datasets, it is `ego_car`.
➡️ Example for single-object NVS
```shell
python code_single/tools/render.py --resume_dir logs/bmvs/5c0d13 \
    --nvs_path=spherical_spiral --nvs_node_id=camera --nvs_param=48,29,54 \
    --nvs_num_frames=120 --downscale=1 \
    --assetbank_cfg.Main.model_params.ray_query_cfg.query_param.num_coarse=0
```
➡️ Example for street-view NVS
NOTE: `--start_frame` and `--stop_frame` in this case specify the reference frames for the camera path creation method. The actual length of the NVS path is specified by `--nvs_num_frames`.
```shell
python code_single/tools/render.py --resume_dir logs/streetsurf/seg100613 \
    --nvs_path=street_view --nvs_node_id=ego_car --nvs_param=2.0,1.0,3.0,0.0,2.0,-2.0 \
    --nvs_num_frames=120 --start_frame=80 --stop_frame=160 --downscale=4 \
    --assetbank_cfg.Street.model_params.ray_query_cfg.query_param.num_coarse=0
```
To visualize a specific mesh from the perspective of the original cameras when rendering, additionally specify `--render_mesh=xxx.ply`:
```shell
python code_single/tools/render.py --resume_dir /path/to/logs/xxx \
    --downscale=4 --render_mesh /path/to/logs/xxx/meshes/xxx.ply \
    --render_mesh_transform=identity \
    --assetbank_cfg.Main.model_params.ray_query_cfg.query_param.num_coarse=0
```
NOTE: If the input mesh is already in world coordinates (e.g. `--to_world` was specified when extracting the mesh), `--render_mesh_transform` should just be `identity`. If the input mesh is in object coordinates, `--render_mesh_transform` should be `to_world`.
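To illustrate the difference, here is a minimal numpy sketch of what applying `to_world` to object-space vertices amounts to; the 4x4 matrix below is hypothetical, not taken from the repo:

```python
# Hypothetical object-to-world transform (NOT from the repo), shown only to
# illustrate what the `to_world` option does compared to `identity`.
import numpy as np

T = np.eye(4)                       # 4x4 homogeneous object-to-world transform
T[:3, 3] = [10.0, 0.0, 1.5]         # e.g. a pure translation of the object in the world

verts_obj = np.random.rand(100, 3)  # mesh vertices in object coordinates
# `to_world`: rotate and translate into world coordinates; `identity` leaves vertices as-is.
verts_world = verts_obj @ T[:3, :3].T + T[:3, 3]
```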
This is similar to the replay mode in tools/render.py, but with additional calculations and limitations for evaluation.
```shell
python code_single/tools/eval.py --resume_dir /path/to/logs/xxx
```
For example, to simulate the original LiDAR model:

```shell
python code_single/tools/render.py --resume_dir /path/to/logs/xxx \
    --no_cam --render_lidar --lidar_model=original_reren --lidar_id=lidar_TOP
```
➡️ A visualization window can be popped up by additionally specifying `--lidar_vis_verbose`.
You can also try this out when rendering in NVS mode.
In addition to the original LiDAR, numerous other real-world LiDAR models can be simulated.
➡️ Below is a script that sequentially simulates a list of LiDAR models:
```shell
bash code_single/tools/demo_lidar_sim.sh /path/to/logs/xxx --lidar_vis_width=1200
```
➡️ A visualization window can be popped up by additionally specifying `--lidar_vis_verbose`.
```shell
python code_single/tools/eval_lidar.py --resume_dir /path/to/logs/xxx \
    --lidar_id=lidar_TOP --dirname=eval_lidar_TOP
```
➡️ A visualization video like the one on the StreetSurf website can be produced by additionally specifying `--video_backend=vedo`:
```shell
python code_single/tools/eval_lidar.py --resume_dir /path/to/logs/xxx \
    --lidar_id=lidar_TOP --dirname=eval_lidar_TOP --video_backend=vedo
```
➡️ A visualization window can be popped up by additionally specifying `--video_verbose`.
To extract the mesh of a specific experiment:
➡️ For SDF networks:
```shell
python code_single/tools/extract_mesh.py --resume_dir /path/to/logs/xxx \
    --to_world --res=0.1
```
➡️ For NeRF networks (you can specify a different sigma threshold with `--levelset=`):
```shell
python code_single/tools/extract_mesh.py --resume_dir /path/to/logs/xxx \
    --to_world --res=0.1 --network_type=nerf --levelset=1.0
```
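For intuition, below is a generic sketch of level-set extraction from a field sampled on a regular grid, using skimage's marching cubes; this is not the repo's extract_mesh.py, and the field and threshold here are stand-ins:

```python
# Generic level-set extraction sketch (NOT the repo's extract_mesh.py).
import numpy as np
from skimage import measure

# Stand-in field sampled on a regular grid; in practice this would come from
# querying the trained SDF (level 0) or NeRF density (level set by --levelset).
field = np.random.rand(64, 64, 64)

# `spacing` plays the role of --res: the world-space side length of each grid cell.
verts, faces, normals, values = measure.marching_cubes(field, level=0.5, spacing=(0.1, 0.1, 0.1))
```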
To extract the occupancy grid of a specific experiment:

```shell
python code_single/tools/extract_occgrid.py --resume_dir /path/to/logs/xxx \
    --occ_res=0.1
```
A visualization window can be popped up when the extraction is finished by additionally specifying `--verbose` (:warning: might run out of CPU memory if `occ_res` is small and the resulting resolution is large).
We opt to store the actual occupied integer coordinates rather than a full-resolution 3D boolean grid to save space.
The output file is in `.npz` format, containing occupied vertex coordinates and meta information.
Below is a description and example of how to read the file:
```python
import numpy as np

datadict = np.load("xxx.npz", allow_pickle=True)
datadict['occ_corners']  # [N, 3], int16, integer coordinates of the actually occupied grid points, where N is the number of occupied cells
datadict['sidelength']   # [res_x, res_y, res_z], int, integer side lengths allocated in the x, y, z directions respectively
datadict['occ_res']      # float, default 0.1, resolution setting when extracting the occupancy grid, i.e. the side length of each cubic cell
datadict['coord_min']    # [3,], float, world coordinates of the front-left-bottom corner vertex (the vertex with the smallest x, y, z values) of the grid cell at integer coordinate [0,0,0]
datadict['coord_offset'] # [3,], float, offset between this world coordinate system and that of the original data sequence (original_world = current_world + coord_offset)
datadict['meta']         # dict, meta information of the scene
datadict['meta']['scene_id']     # str, full name id of the current sequence, "segment-xxxxx-with_camera_labels"
datadict['meta']['start_frame']  # int, the start frame defined during training of the current sequence
datadict['meta']['num_frames']   # int, the total number of frames defined during training of the current sequence (end frame = start frame + total number of frames)

# To convert back to world coordinates:
voxel_coords_in_world = datadict['occ_corners'].astype(float) * datadict['occ_res'] + datadict['coord_min']
voxel_coords_in_data_world = voxel_coords_in_world + datadict['coord_offset']
```
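As a follow-up sketch, continuing from the `datadict` loaded above, the sparse coordinates can be expanded back into the dense boolean grid that this format avoids storing on disk:

```python
# Rebuild the dense boolean occupancy grid from the sparse integer coordinates
# (continuing from the `datadict` loaded above).
dense_grid = np.zeros(tuple(datadict['sidelength']), dtype=bool)
ix, iy, iz = datadict['occ_corners'].astype(np.int64).T  # split into per-axis index arrays
dense_grid[ix, iy, iz] = True                            # mark occupied cells
```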