The ArtBench Dataset: Benchmarking Generative Models with Artworks
Peiyuan Liao*, Xiuyu Li*, Xihui Liu, Kurt Keutzer
* equal contribution
ArtBench-10 is the first class-balanced, high-quality, cleanly annotated, and standardized dataset for benchmarking artwork generation. It comprises 60,000 images of artwork from 10 distinctive artistic styles, with 5,000 training images and 1,000 testing images per style.
ArtBench-10 has several advantages over previous artwork datasets:
- it is class-balanced, whereas most previous artwork datasets suffer from long-tailed class distributions
- the images are of high quality with clean annotations
- it is created with standardized data collection, annotation, filtering, and preprocessing procedures.
We provide three versions of the dataset at different resolutions (32 x 32, 256 x 256, and original image size), formatted so that they are easy to use with popular machine learning frameworks.
- Metadata as a CSV file
- 32x32 CIFAR-python: works seamlessly with implementations using the CIFAR-10 dataset
- 32x32 CIFAR-binary: compatible with C programs, tensorflow-datasets, etc.
- 256x256 ImageFolder and 256x256 ImageFolder with train-test split (recommended): work seamlessly with torchvision's ImageFolder implementation
- original size LSUN, per-style: works seamlessly with implementations using LSUN datasets
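As an illustration of the CIFAR-binary version, here is a minimal sketch of a reader that uses only the Python standard library. It assumes the files follow the standard CIFAR-10 binary layout: each record is 1 label byte followed by 3,072 pixel bytes (1,024 red, then 1,024 green, then 1,024 blue, each plane in row-major order). That ArtBench-10 follows this layout exactly is an assumption based on the format's name, and the helper names below are hypothetical.

```python
from pathlib import Path
from typing import Iterator, List, Tuple

IMAGE_SIDE = 32
PLANE_BYTES = IMAGE_SIDE * IMAGE_SIDE   # 1,024 bytes per colour plane
PIXEL_BYTES = 3 * PLANE_BYTES           # 3,072 bytes per image
RECORD_BYTES = 1 + PIXEL_BYTES          # 1 label byte + pixels (assumed layout)


def iter_records(raw: bytes) -> Iterator[Tuple[int, bytes]]:
    """Yield (label, pixels) pairs from a CIFAR-binary style buffer."""
    if len(raw) % RECORD_BYTES != 0:
        raise ValueError("buffer length is not a multiple of the record size")
    for offset in range(0, len(raw), RECORD_BYTES):
        label = raw[offset]                              # class index, 0..9
        pixels = raw[offset + 1 : offset + RECORD_BYTES]
        yield label, pixels


def pixel_rgb(pixels: bytes, row: int, col: int) -> Tuple[int, int, int]:
    """Return the (R, G, B) value at (row, col) from channel-major pixels."""
    idx = row * IMAGE_SIDE + col
    return pixels[idx], pixels[PLANE_BYTES + idx], pixels[2 * PLANE_BYTES + idx]


def read_batch(path: str) -> List[Tuple[int, bytes]]:
    """Read every record from one batch file on disk."""
    return list(iter_records(Path(path).read_bytes()))
```

For example, `read_batch("data_batch_1.bin")` would return a list of `(label, pixels)` pairs; the file name here is hypothetical and borrowed from CIFAR-10's naming, so check the extracted archive for the actual names.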
See artbench.py for PyTorch usage. You only need ~20 lines of code to start using ArtBench-10 in your PyTorch workloads!
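If you use the 256x256 ImageFolder version with the train-test split instead, a loader can be sketched roughly as follows. This is not the code in artbench.py. It assumes the extracted archive contains one subdirectory per style under each split directory, which is the layout torchvision's `ImageFolder` expects. The function names are hypothetical, and `torch` and `torchvision` are imported lazily so that the class-balance check works without them.

```python
from collections import Counter
from pathlib import Path
from typing import Dict

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def count_per_class(split_dir: str) -> Dict[str, int]:
    """Count image files in each class subdirectory of an ImageFolder split."""
    counts: Counter = Counter()
    for class_dir in sorted(Path(split_dir).iterdir()):
        if class_dir.is_dir():
            counts[class_dir.name] = sum(
                1 for p in class_dir.iterdir() if p.suffix.lower() in IMAGE_SUFFIXES
            )
    return dict(counts)


def is_balanced(counts: Dict[str, int]) -> bool:
    """True when every class has the same number of images."""
    return len(set(counts.values())) <= 1


def make_loader(split_dir: str, batch_size: int = 64, shuffle: bool = True):
    """Wrap one split in a DataLoader (requires torch and torchvision)."""
    from torch.utils.data import DataLoader
    from torchvision import datasets, transforms

    # ImageFolder assigns one integer label per subdirectory, in sorted order.
    dataset = datasets.ImageFolder(str(split_dir), transform=transforms.ToTensor())
    return DataLoader(dataset, batch_size=batch_size, shuffle=shuffle)
```

Because the dataset is class-balanced, `count_per_class` on the training split should report 5,000 images for each of the 10 styles, and 1,000 each on the test split. The split directory names (for example `train` and `test`) are an assumption, so adjust them to match the extracted archive.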
If you find the work useful in your research, please consider citing:
@article{liao2022artbench,
title={The ArtBench Dataset: Benchmarking Generative Models with Artworks},
author={Liao, Peiyuan and Li, Xiuyu and Liu, Xihui and Keutzer, Kurt},
journal={arXiv preprint arXiv:2206.11404},
year={2022}
}