Data

1. General Description
2. Download
3. Dataset Conversion
4. References

1. General Description

This dataset is used to train Spatial Language Integrating Model (SLIM) in [1]. It consists of virtual scenes with ten views for each scene. Each scene consists of two or three objects placed on a square walled room. Each view is represented by an image, and synthetic or natural language descriptions. View images are 3D pictures rendered from a particular scene from ten different camera viewpoints.

2. Download

gustil need to be installed to download the dataset. The dataset is available from here. Because the dataset is huge, about 600 GB, I downloaded the whole dataset except the synthetic_data/train data.

Use the command below to download the dataset.

gsutil -m cp -c -L manifest.log -r \
  "gs://slim-dataset/turk_data/" \
  ./<DATASET_FOLDER>/

After downloading is finished, make sure to manually create test, valid, and train directories under the turk_data and move the respective files under them. The dataset directory should be as shown below.

<DATASET_FOLDER>
└── synthetic_data
    └── turk_data
        ├── test
        ├── train
        └── valid

3. Dataset Conversion

Dataset files are in tfrecord fromat. TFRecord file format is a binary storage format which are optimized to be used with Tensorflow. As I will use pyTorch, the dataset files are converted to pt.gz format. To convert the dataset use the following command:

./convert_slim_dataset.sh <absolute_path/to/dataset_folder>

If you want to use the default value, just use: ./convert_slim_dataset.sh

After conversion, dataset directory will be as follows:

<DATASET_FOLDER>
├── synthetic_data
│   └── turk_data
│       ├── test
│       ├── train
│       └── valid
└── turk_data_torch
    ├── test
    ├── train
    └── valid

4. References

[1] Ramalho, T., Kočiský, T., Besse, F., Eslami, S. M., Melis, G., Viola, F., ... & Hermann, K. M. (2018). Encoding spatial relations from natural language. arXiv preprint arXiv:1807.01670.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

README.md

README.md

Data

1. General Description

2. Download

3. Dataset Conversion

4. References

Files

README.md

Latest commit

History

README.md

File metadata and controls

Data

1. General Description

2. Download

3. Dataset Conversion

4. References