This repo explains how to download and process the ImageNet-1K train/val archives so they can be used as a dataset.
- Download the ImageNet-1K train/val dataset from Academic Torrents: train link, val link
- See my velog post for downloading on a Linux server: link
- See the original ImageNet website for more information: link
- The ImageNet-1K train archive contains one tar file per class, as shown below
    └── ILSVRC2012_img_train.tar
        ├── n01440764.tar
        ├── n01443537.tar
        ├── n01484850.tar
        ├── ...
        └── n15075141.tar
- The ImageNet-1K val archive contains the images directly, as shown below
    └── ILSVRC2012_img_val.tar
        ├── ILSVRC2012_val_00000001.JPEG
        ├── ILSVRC2012_val_00000002.JPEG
        ├── ILSVRC2012_val_00000003.JPEG
        ├── ...
        └── ILSVRC2012_val_00050000.JPEG
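
Before unpacking, it can help to confirm that a downloaded archive really has the layout shown above. The following is a minimal sketch using only the Python standard library; the function name `list_tar_names` and its `limit` parameter are illustrative and not part of this repo.

```python
import tarfile


def list_tar_names(tar_source, limit=5):
    """Return the first `limit` member names of a tar archive.

    `tar_source` may be a file path or an open binary file object.
    """
    if hasattr(tar_source, "read"):
        tar = tarfile.open(fileobj=tar_source, mode="r")
    else:
        tar = tarfile.open(tar_source, mode="r")
    names = []
    with tar:
        # Iterating reads headers one by one, so stopping early
        # avoids scanning the whole (very large) archive.
        for member in tar:
            names.append(member.name)
            if len(names) >= limit:
                break
    return names
```

Calling `list_tar_names("ILSVRC2012_img_train.tar")` should return names ending in `.tar`, while the val archive should return names ending in `.JPEG`.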
- ImageNet_class_index.json : contains the class information
  - Caution : some labels appear under two different class numbers
    - crane : 134, 517
    - maillot : 638, 639
- ImageNet_val_label.txt : contains the labels of the validation images
- check.py : checks whether the archives were unpacked correctly
- unpack.py : builds clean file trees from ILSVRC2012_img_train.tar and ILSVRC2012_img_val.tar for use as a dataset
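
The duplicate-label caution can be checked programmatically. The sketch below assumes the class index is a JSON object of the form `{"0": ["n01440764", "tench"], ...}` (class number mapped to a WordNet ID and a label); this layout is an assumption, so adjust the unpacking if ImageNet_class_index.json is structured differently. The helper names are illustrative.

```python
import json
from collections import defaultdict


def load_class_index(path):
    """Load the class index JSON file into a dict."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def find_duplicate_labels(class_index):
    """Map each label used by more than one class number to those numbers.

    ASSUMPTION: class_index looks like {"0": ["n01440764", "tench"], ...}.
    """
    by_label = defaultdict(list)
    for class_num, (_wnid, label) in class_index.items():
        by_label[label].append(int(class_num))
    return {
        label: sorted(nums)
        for label, nums in by_label.items()
        if len(nums) > 1
    }
```

If the assumed layout holds, `find_duplicate_labels(load_class_index("ImageNet_class_index.json"))` should report the crane and maillot pairs listed above. Code that looks classes up by label needs to handle these pairs, whereas lookups by class number are unaffected.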
- Assume all the required files are in the same directory (base_dir), as shown below
    └── base_dir
        ├── ILSVRC2012_img_train.tar
        ├── ILSVRC2012_img_val.tar
        ├── ImageNet_class_index.json
        └── ImageNet_val_label.txt
- In unpack.py, change the base_dir and target_dir variables
- Run unpack.py; it builds the file trees shown below in the directory you specified (target_dir)
    └── target_dir
        ├── train
        │   ├── 0
        │   │   ├── n01440764_18.JPEG
        │   │   ├── n01440764_36.JPEG
        │   │   └── ...
        │   ├── 1
        │   ├── ...
        │   └── 999
        └── val
            ├── 0
            │   ├── ILSVRC2012_val_00000293.JPEG
            │   ├── ILSVRC2012_val_00002138.JPEG
            │   └── ...
            ├── 1
            ├── ...
            └── 999
- In check.py, change the ImageNet_dir variable and run it to double-check the result
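
For readers who want to see roughly what the unpack and check steps involve, here is a minimal standard-library sketch. It is not the code in unpack.py or check.py. The function names are illustrative, and two inputs are assumptions: `wnid_to_class` (a dict such as `{"n01440764": 0}`) and `val_labels` (a dict mapping a val file name to its class number) are expected to be built by the caller from ImageNet_class_index.json and ImageNet_val_label.txt, whose exact formats are not described here.

```python
import io
import tarfile
from pathlib import Path


def _write_member(tar, member, out_dir):
    """Write one tar member into out_dir, keeping only its base name."""
    out_dir.mkdir(parents=True, exist_ok=True)
    data = tar.extractfile(member).read()
    (out_dir / Path(member.name).name).write_bytes(data)


def unpack_train(train_tar, target_dir, wnid_to_class):
    """Unpack the nested train archive into target_dir/train/<class number>/."""
    count = 0
    with tarfile.open(train_tar) as outer:
        for class_tar in outer:
            if not class_tar.isfile():
                continue
            wnid = Path(class_tar.name).stem  # "n01440764.tar" -> "n01440764"
            out_dir = Path(target_dir) / "train" / str(wnid_to_class[wnid])
            # Each per-class tar is read into memory before being opened.
            inner_bytes = outer.extractfile(class_tar).read()
            with tarfile.open(fileobj=io.BytesIO(inner_bytes)) as inner:
                for image in inner:
                    if image.isfile():
                        _write_member(inner, image, out_dir)
                        count += 1
    return count


def unpack_val(val_tar, target_dir, val_labels):
    """Unpack val images into target_dir/val/<class number>/."""
    count = 0
    with tarfile.open(val_tar) as tar:
        for image in tar:
            if not image.isfile():
                continue
            name = Path(image.name).name
            out_dir = Path(target_dir) / "val" / str(val_labels[name])
            _write_member(tar, image, out_dir)
            count += 1
    return count


def summarize_split(split_dir, num_classes=1000):
    """Return (missing class numbers, total .JPEG count) for one split."""
    split = Path(split_dir)
    missing, total = [], 0
    for class_num in range(num_classes):
        class_dir = split / str(class_num)
        if not class_dir.is_dir():
            missing.append(class_num)
            continue
        total += sum(1 for p in class_dir.iterdir() if p.suffix.upper() == ".JPEG")
    return missing, total
```

After a full unpack, `summarize_split(target_dir / "val")` should return no missing classes and 50,000 images, matching the val file numbering above; the train split of ILSVRC2012 is commonly reported as 1,281,167 images.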