The VR-NeRF Eyeful Tower Dataset

Linning Xu · Vasu Agrawal · William Laney · Tony Garcia · Aayush Bansal
Changil Kim · Samuel Rota Bulò · Lorenzo Porzi · Peter Kontschieder
Aljaž Božič · Dahua Lin · Michael Zollhöfer · Christian Richardt

ACM SIGGRAPH Asia 2023

Dataset Overview

Scene	ver	cams	pos	img	2K EXRs	1K EXRs	8K+ JPEGs	4K JPEGs	2K JPEGs	1K JPEGs
apartment	v2	22	180	3,960	123 GB	31 GB	92 GB	20 GB	5 GB	1.2 GB
kitchen	v2*	19	318	6,024	190 GB	48 GB	142 GB	29 GB	8 GB	1.9 GB
office1a	v1	9	85	765	24 GB	6 GB	15 GB	3 GB	1 GB	0.2 GB
office1b	v2	22	71	1,562	49 GB	13 GB	35 GB	7 GB	2 GB	0.4 GB
office2	v1	9	233	2,097	66 GB	17 GB	46 GB	9 GB	2 GB	0.5 GB
office_view1	v2	22	126	2,772	87 GB	22 GB	63 GB	14 GB	4 GB	0.8 GB
office_view2	v2	22	67	1,474	47 GB	12 GB	34 GB	7 GB	2 GB	0.5 GB
riverview	v2	22	48	1,008	34 GB	8 GB	24 GB	5 GB	2 GB	0.4 GB
seating_area	v1	9	168	1,512	48 GB	12 GB	36 GB	8 GB	2 GB	0.5 GB
table	v1	9	134	1,206	38 GB	9 GB	26 GB	6 GB	2 GB	0.4 GB
workshop	v1	9	700	6,300	198 GB	50 GB	123 GB	27 GB	8 GB	2.1 GB
raf_emptyroom	v2	22	365	8,030	252 GB	63 GB	213 GB	45 GB	12 GB	2.5 GB
raf_furnishedroom	v2	22	154	3,388	106 GB	27 GB	90 GB	19 GB	5 GB	1.1 GB
Total					1,262 GB	318 GB	939 GB	199 GB	54 GB	12.5 GB

* v2 with 3 fewer cameras than standard configuration, i.e. only 19 cameras.

Example Images

apartment	kitchen	office1a	office1b

office2	office_view1	office_view2	riverview

seating_area	table	workshop

April 2024: The following two scenes accompany our paper Real Acoustic Fields (CVPR 2024):

raf_emptyroom	raf_furnishedroom

Capture Rig

All images in the dataset were taken with either Eyeful Tower v1 or v2 (as specified in the overview table). Eyeful Tower v1 comprises 9 fisheye cameras, whereas Eyeful Tower v2 comprises 22 pinhole cameras (19 for “kitchen”).

Download instructions

The Eyeful Tower dataset is hosted on AWS S3, and can be explored with any browser or downloaded with standard software, such as wget or curl.

However, for the fastest, most reliable download, we recommend using the AWS command line interface (AWS CLI); see the AWS CLI installation instructions.

Optional: Speed up downloading by increasing the number of concurrent downloads from 10 to 100:

aws configure set default.s3.max_concurrent_requests 100

Download a single scene (1K JPEGs only)

aws s3 cp --recursive --no-sign-request s3://fb-baas-f32eacb9-8abb-11eb-b2b8-4857dd089e15/EyefulTower/apartment/images-jpeg-1k/ apartment/images-jpeg-1k/

Alternatively, use “sync” to avoid transferring existing files:

aws s3 sync --no-sign-request s3://fb-baas-f32eacb9-8abb-11eb-b2b8-4857dd089e15/EyefulTower/apartment/images-jpeg-1k/ apartment/images-jpeg-1k/

If you want to experiment with specific cameras, we recommend watching the collage video (images-jpeg-2k/collage.mp4) first to identify which camera views to use. For example, for the apartment scene, captured with the v2 rig, you might consider camera IDs 19, 20 and 21, which are placed at the same height.
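
To fetch only selected cameras, the AWS CLI's `--exclude` and `--include` filters can limit a sync to chosen camera folders. The following is a minimal sketch that only assembles the command line and does not run it; the helper name `build_camera_sync_command` is hypothetical, and it assumes the `{scene}/{image_format}/{camera}/` folder layout described under Data Organization.

```python
import shlex

BUCKET_PREFIX = "s3://fb-baas-f32eacb9-8abb-11eb-b2b8-4857dd089e15/EyefulTower"


def build_camera_sync_command(scene, image_format, cameras):
    """Return an `aws s3 sync` argument list limited to the given camera folders."""
    source = f"{BUCKET_PREFIX}/{scene}/{image_format}/"
    target = f"{scene}/{image_format}/"
    args = ["aws", "s3", "sync", "--no-sign-request", source, target]
    # Filters are evaluated in order: exclude everything first, then
    # re-include the folders of the requested cameras.
    args += ["--exclude", "*"]
    for camera in cameras:
        args += ["--include", f"{camera}/*"]
    return args


command = build_camera_sync_command("apartment", "images-jpeg-1k", [19, 20, 21])
print(shlex.join(command))  # paste the printed line into a shell to run it
```

Building the argument list separately makes it easy to inspect the command before starting a large transfer.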

Download all scenes (1K JPEGs only) using bash — 9 GB

for dataset in apartment kitchen office1a office1b office2 office_view1 office_view2 riverview seating_area table workshop; do
  mkdir -p $dataset/images-jpeg-1k;
  aws s3 cp --recursive --no-sign-request s3://fb-baas-f32eacb9-8abb-11eb-b2b8-4857dd089e15/EyefulTower/$dataset/images-jpeg-1k/ $dataset/images-jpeg-1k/;
done
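
The 9 GB figure covers the 11 scenes named in the loop, not the two raf_* scenes. As a quick check, the sketch below sums the 1K JPEG column of the overview table; the dictionary and function names are illustrative only.

```python
# 1K JPEG sizes in GB, copied from the overview table.
JPEG_1K_SIZES_GB = {
    "apartment": 1.2, "kitchen": 1.9, "office1a": 0.2, "office1b": 0.4,
    "office2": 0.5, "office_view1": 0.8, "office_view2": 0.5, "riverview": 0.4,
    "seating_area": 0.5, "table": 0.4, "workshop": 2.1,
    "raf_emptyroom": 2.5, "raf_furnishedroom": 1.1,
}


def total_size_gb(scenes):
    """Sum the 1K JPEG sizes of the given scenes, rounded to 0.1 GB."""
    return round(sum(JPEG_1K_SIZES_GB[scene] for scene in scenes), 1)


loop_scenes = [s for s in JPEG_1K_SIZES_GB if not s.startswith("raf_")]
print(total_size_gb(loop_scenes))       # → 8.9 (the 11 scenes in the loop)
print(total_size_gb(JPEG_1K_SIZES_GB))  # → 12.5 (all 13 scenes)
```

Adding raf_emptyroom and raf_furnishedroom to the loop would therefore bring the download to about 12.5 GB, matching the table total.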

Download the entire Eyeful Tower dataset — 3.5 TB

aws s3 sync --no-sign-request s3://fb-baas-f32eacb9-8abb-11eb-b2b8-4857dd089e15/EyefulTower/ .

Data Organization

Each scene is organized following this structure:

apartment
│
├── apartment-final.pdf      # Metashape reconstruction report
├── cameras.json             # Camera poses in KRT format (see below)
├── cameras.xml              # Camera poses exported from Metashape
├── colmap                   # COLMAP reconstruction exported from Metashape
│   ├── images               # Undistorted images (full resolution)
│   ├── images_2             # Undistorted images (1/2 resolution)
│   ├── images_4             # Undistorted images (1/4 resolution)
│   ├── images_8             # Undistorted images (1/8 resolution)
│   └── sparse               # COLMAP reconstruction (for full-res images)
├── images-1k                # HDR images at 1K resolution
│   ├── 10                   # First camera (bottom-most camera)
│   │   ├── 10_DSC0001.exr   # First image
│   │   ├── 10_DSC0010.exr   # Second image
│   │   ├── [...]            # More images
│   │   └── 10_DSC1666.exr   # Last image
│   ├── 11                   # Second camera
│   │   ├── 11_DSC0001.exr
│   │   ├── 11_DSC0010.exr
│   │   ├── [...]
│   │   └── 11_DSC1666.exr
│   ├── [...]                # More cameras
│   └── 31                   # Last camera (top of tower)
│       ├── 31_DSC0001.exr
│       ├── 31_DSC0010.exr
│       ├── [...]
│       └── 31_DSC1666.exr
├── images-2k [...]          # HDR images at 2K resolution
├── images-jpeg [...]        # Full-resolution JPEG images
├── images-jpeg-1k [...]     # JPEG images at 1K resolution
├── images-jpeg-2k           # JPEG images at 2K resolution
│   ├── [10 ... 31]
│   ├── [10 ... 31].mp4      # Camera visualization
│   └── collage.mp4          # Collage of all cameras
├── images-jpeg-4k [...]     # JPEG images at 4K resolution
├── mesh.jpg                 # Mesh texture (16K×16K)
├── mesh.mtl                 # Mesh material file
├── mesh.obj                 # Mesh in OBJ format
└── splits.json              # Training/testing splits

HDR images (`images-1k/{camera}/*.exr` and `images-2k/{camera}/*.exr`)

High dynamic range images merged from 9-photo raw exposure brackets.
Downsampled to “1K” (684×1024 pixels) or “2K” resolution (1368×2048 pixels).
Color space: DCI-P3 (linear)
Stored as EXR images with uncompressed 32-bit floating-point numbers.
All image filenames are prefixed with the camera name, e.g. 17_DSC0316.exr.
Images with filenames ending in the same number are captured at the same time.
Some images may be missing, e.g. because blurry images or images showing the capture operator were removed.
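
Because images taken at the same time share the trailing number, the views of one capture position can be collected by parsing filenames. The sketch below assumes names of the form `{camera}_{letters}{number}.{exr|jpg}` as in the examples above; `group_by_shot` is a hypothetical helper, not part of the dataset tooling.

```python
import re
from collections import defaultdict

# Matches names such as "17_DSC0316.exr": camera prefix, letters, shot number.
FILENAME_PATTERN = re.compile(r"^(?P<camera>\d+)_\D*(?P<shot>\d+)\.(?:exr|jpg)$")


def group_by_shot(filenames):
    """Map each shot number to a {camera: filename} dict of simultaneous images."""
    shots = defaultdict(dict)
    for name in filenames:
        match = FILENAME_PATTERN.match(name)
        if match is None:
            continue  # skip anything that does not follow the naming scheme
        shots[match["shot"]][match["camera"]] = name
    return dict(shots)


names = ["10_DSC0001.exr", "11_DSC0001.exr", "10_DSC0010.exr", "31_DSC0010.exr"]
groups = group_by_shot(names)
print(sorted(groups["0001"]))  # → ['10', '11']
```

Since some images were removed, a group may contain fewer cameras than the rig has, so downstream code should not assume a fixed group size.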

Example code: reading EXR images to create JPEGs

import os, cv2, numpy as np

# Enable OpenEXR support in OpenCV (https://github.com/opencv/opencv/issues/21326).
# This environment variable needs to be defined before the first EXR image is opened.
os.environ["OPENCV_IO_ENABLE_OPENEXR"] = "1"

# Read an EXR image using OpenCV.
img = cv2.imread("apartment/images-2k/17/17_DSC0316.exr", cv2.IMREAD_UNCHANGED)

# Apply white-balance scaling (Note: OpenCV uses BGR colors).
coeffs = np.array([0.726097, 1.0, 1.741252])  # apartment [RGB]
img = np.einsum("ijk,k->ijk", img, coeffs[::-1])

# Tonemap using sRGB curve.
linear_part = 12.92 * img
exp_part = 1.055 * (np.maximum(img, 0.0) ** (1 / 2.4)) - 0.055
img = np.where(img <= 0.0031308, linear_part, exp_part)

# Write resulting image as JPEG.
img = np.clip(255 * img, 0.0, 255.0).astype(np.uint8)
cv2.imwrite("apartment-17_DSC0316.jpg", img, params=[cv2.IMWRITE_JPEG_QUALITY, 100])

JPEG images (`images-jpeg*/{camera}/*.jpg`)

We provide JPEG images at four resolution levels:
1. images-jpeg/: 5784 × 8660 = 50.1 megapixels (full original image resolution)
2. images-jpeg-4k/: 2736 × 4096 = 11.2 megapixels
3. images-jpeg-2k/: 1368 × 2048 = 2.8 megapixels
4. images-jpeg-1k/: 684 × 1024 = 0.7 megapixels
The JPEG images are white-balanced and tone-mapped versions of the HDR images. See the code above for the details.

Each scene uses white-balance settings derived from a ColorChecker, which individually scale the RGB channels as follows:

Scene	RGB scale factors
apartment	`0.726097, 1.0, 1.741252`
kitchen	`0.628143, 1.0, 2.212346`
office1a	`0.740846, 1.0, 1.750224`
office1b	`0.725535, 1.0, 1.839938`
office2	`0.707729, 1.0, 1.747833`
office_view1	`1.029089, 1.0, 1.145235`
office_view2	`0.939620, 1.0, 1.273549`
riverview	`1.077719, 1.0, 1.145992`
seating_area	`0.616093, 1.0, 2.426888`
table	`0.653298, 1.0, 2.139514`
workshop	`0.709929, 1.0, 1.797705`
raf_emptyroom	`0.718776, 1.0, 1.787020`
raf_furnishedroom	`0.721494, 1.0, 1.793423`
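
For scripts that process several scenes, the table can be kept as a lookup. The following pure-Python sketch applies the per-scene factors and the sRGB curve to a single linear RGB pixel; the names `WHITE_BALANCE`, `srgb_encode` and `to_8bit_srgb` are illustrative, and the final step rounds to the nearest integer, whereas the NumPy example above truncates.

```python
# RGB white-balance scale factors, copied from the table above.
WHITE_BALANCE = {
    "apartment": (0.726097, 1.0, 1.741252),
    "kitchen": (0.628143, 1.0, 2.212346),
    "office1a": (0.740846, 1.0, 1.750224),
    "office1b": (0.725535, 1.0, 1.839938),
    "office2": (0.707729, 1.0, 1.747833),
    "office_view1": (1.029089, 1.0, 1.145235),
    "office_view2": (0.939620, 1.0, 1.273549),
    "riverview": (1.077719, 1.0, 1.145992),
    "seating_area": (0.616093, 1.0, 2.426888),
    "table": (0.653298, 1.0, 2.139514),
    "workshop": (0.709929, 1.0, 1.797705),
    "raf_emptyroom": (0.718776, 1.0, 1.787020),
    "raf_furnishedroom": (0.721494, 1.0, 1.793423),
}


def srgb_encode(x):
    """Apply the sRGB transfer curve to one non-negative linear value."""
    if x <= 0.0031308:
        return 12.92 * x
    return 1.055 * x ** (1 / 2.4) - 0.055


def to_8bit_srgb(linear_rgb, scene):
    """White-balance and tone-map one linear RGB pixel to 8-bit sRGB."""
    gains = WHITE_BALANCE[scene]
    scaled = [value * gain for value, gain in zip(linear_rgb, gains)]
    encoded = [srgb_encode(max(value, 0.0)) for value in scaled]
    return tuple(int(round(min(max(255 * v, 0.0), 255.0))) for v in encoded)


print(to_8bit_srgb((0.0, 1.0, 2.0), "apartment"))  # → (0, 255, 255)
```

Note that this helper takes RGB order, while OpenCV arrays are BGR and need the factors reversed, as in the example above.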

Camera calibration in KRT format (`cameras.json`)

This JSON file has the basic structure {"KRT": [<one object per image>]}, where each image object has the following properties:

width: image width, in pixels (usually 5784)
height: image height, in pixels (usually 8660)
cameraId: filename component for this image (e.g. "0/0_REN0001"); to get a complete path, use "{scene}/{imageFormat}/{cameraId}.{extension}" for:
- scene: any of the scene names listed in the overview table,
- imageFormat: one of "images-1k", "images-2k", "images-jpeg-1k", "images-jpeg-2k", "images-jpeg-4k", or "images-jpeg"
- extension: file extension, jpg for JPEGs, exr for EXR images (HDR)
K: 3×3 intrinsic camera matrix for full-resolution image (column-major)
T: 4×4 world-to-camera transformation matrix (column-major)
distortionModel: lens distortion model used:
- "Fisheye" for fisheye images (Eyeful v1)
- "RadialAndTangential" for pinhole images (Eyeful v2)
distortion: lens distortion coefficients for use with OpenCV’s cv2.undistort function
- fisheye images (Eyeful v1): [k1, k2, k3, _, _, _, p1, p2]
  - Note: The projection model is an ideal (equidistant) fisheye model.
- pinhole images (Eyeful v2): [k1, k2, p1, p2, k3] (same order as cv2.undistort)
frameId: position index during capture (consecutive integers)
- all images taken at the same time share the same frameId
sensorId: Metashape sensor ID (aka camera) of this image
- all images taken by the same camera share the same sensorId
cameraMasterId (optional): Metashape camera ID for the master camera (in rig calibration) at this position/frame
- all images taken at the same time share the same cameraMasterId
sensorMasterId (optional): Metashape sensor ID for the master camera in rig calibration
- should have the same value for all cameras except the master camera (usually "6" for Eyeful v1, "13" for Eyeful v2).
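
The path template for cameraId can be sketched as a small helper. This is an illustration only: the function name `image_path` is hypothetical, the rule "folders with jpeg in their name hold .jpg files, all others hold .exr files" is inferred from the directory listing, and the cameraId value used below is assumed to follow the `{camera}/{filename}` pattern.

```python
def image_path(scene, image_format, camera_id):
    """Build "{scene}/{imageFormat}/{cameraId}.{extension}" for one image."""
    # JPEG folders contain "jpeg" in their name; the HDR folders hold EXR files.
    extension = "jpg" if "jpeg" in image_format else "exr"
    return f"{scene}/{image_format}/{camera_id}.{extension}"


print(image_path("apartment", "images-2k", "17/17_DSC0316"))
# → apartment/images-2k/17/17_DSC0316.exr
```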

World coordinate system: right-handed, y-up, y=0 is ground plane, units are in meters.
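
A minimal sketch of working with K and T follows. It assumes that both are stored as nested lists whose inner lists are matrix columns (one reading of "column-major"), that intrinsics can be rescaled uniformly by the width ratio for downsampled images, and that T maps world points as x_cam = R·x_world + t. The helper names and the numeric values in the example entry are made up for illustration; real entries come from cameras.json.

```python
def transpose(matrix):
    """Turn a column-major nested list into the usual row-major matrix."""
    return [list(row) for row in zip(*matrix)]


def scale_intrinsics(K, full_width, target_width):
    """Rescale a row-major 3x3 intrinsic matrix to a downsampled image."""
    s = target_width / full_width
    return [[v * s for v in K[0]], [v * s for v in K[1]], list(K[2])]


def camera_center(T):
    """Camera position in world coordinates from a row-major world-to-camera T."""
    R = [row[:3] for row in T[:3]]
    t = [row[3] for row in T[:3]]
    # For x_cam = R @ x_world + t, the camera center is C = -R^T @ t.
    return [-sum(R[k][i] * t[k] for k in range(3)) for i in range(3)]


# Hypothetical entry with made-up values, laid out as assumed above.
entry = {
    "width": 5784,
    "K": [[4000.0, 0.0, 0.0], [0.0, 4000.0, 0.0], [2892.0, 4330.0, 1.0]],
    "T": [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0],
          [0.0, 0.0, 1.0, 0.0], [1.0, 2.0, 3.0, 1.0]],
}
K = transpose(entry["K"])
T = transpose(entry["T"])
K_1k = scale_intrinsics(K, entry["width"], 684)  # intrinsics for the 1K images
print(camera_center(T))  # → [-1.0, -2.0, -3.0]
```

The uniform scaling ignores sub-pixel differences in principal-point conventions and the slight aspect-ratio rounding of the downsampled sizes, which may matter for high-precision work.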

Camera calibration in Metashape XML format (`cameras.xml`)

Camera calibration data exported directly from Metashape, using its proprietary file format.

Reconstructed 3D mesh (`mesh.*`)

Textured mesh in OBJ format, exported from Metashape and created from the full-resolution JPEG images.
World coordinate system: right-handed, y-up, y=0 is ground plane, units are in meters.

Exported COLMAP reconstruction (`colmap/`)

These COLMAP reconstructions are exported from our original reconstructions using Metashape 2.1.3 with default parameters.

The images under colmap/images were automatically undistorted from the images in images-jpeg to pinhole projections with principal point at image center.
Note that this undistortion severely crops fisheye images, and tends to produce different image sizes for different cameras.
The images in colmap/images_* are downsampled versions of the full-resolution undistorted images, similar to the Mip-NeRF 360 dataset format.

Training/testing splits (`splits.json`)

Contains lists of images for training ("train") and testing ("test").
All images of one camera are held out for testing: camera 5 for Eyeful v1, and camera 17 for Eyeful v2.
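
The provided splits.json should be preferred, but the held-out-camera rule is simple enough to reproduce, for example when working with a subset of images. The sketch below assumes image identifiers of the form `{camera}/{filename}`; `split_by_camera` is a hypothetical helper and the identifiers in the example are illustrative.

```python
def split_by_camera(image_ids, test_camera):
    """Hold out every image of one camera for testing; train on the rest."""
    train, test = [], []
    for image_id in image_ids:
        camera = image_id.split("/")[0]
        (test if camera == str(test_camera) else train).append(image_id)
    return {"train": train, "test": test}


ids = ["16/16_DSC0001", "17/17_DSC0001", "17/17_DSC0010", "18/18_DSC0001"]
splits = split_by_camera(ids, test_camera=17)  # camera 17 for Eyeful v2
print(splits["test"])  # → ['17/17_DSC0001', '17/17_DSC0010']
```

For Eyeful v1 scenes, the same call with `test_camera=5` follows the rule stated above.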

Changelog

3 Nov 2023 – initial dataset release
18 Jan 2024 – added “1K” resolution (684×1024 pixels) EXRs and JPEGs for small-scale experimentation.
19 Apr 2024 – added two rooms from Real Acoustic Fields (RAF) dataset: raf_emptyroom and raf_furnishedroom.
9 Oct 2024 – added exported COLMAP reconstructions with undistorted images in the Mip-NeRF 360 format, e.g. compatible with gsplat.

Citation

If you use any data from this dataset or any code released in this repository, please cite the VR-NeRF paper.

@InProceedings{VRNeRF,
  author    = {Linning Xu and
               Vasu Agrawal and
               William Laney and
               Tony Garcia and
               Aayush Bansal and
               Changil Kim and
               Rota Bulò, Samuel and
               Lorenzo Porzi and
               Peter Kontschieder and
               Aljaž Božič and
               Dahua Lin and
               Michael Zollhöfer and
               Christian Richardt},
  title     = {{VR-NeRF}: High-Fidelity Virtualized Walkable Spaces},
  booktitle = {SIGGRAPH Asia Conference Proceedings},
  year      = {2023},
  doi       = {10.1145/3610548.3618139},
  url       = {https://vr-nerf.github.io},
}

License

Creative Commons Attribution-NonCommercial (CC BY-NC) 4.0, as found in the LICENSE file.

[Terms of Use] [Privacy Policy]
