ImageNet-1K data download, processing for using as a dataset

Jasonlee1995/ImageNet-1K

ImageNet-1K

This repo explains how to download the ImageNet-1K train/val data and process it into a directory layout that can be used as a dataset.

1. Data Download

  • Download the ImageNet-1K train/val archives from Academic Torrents: train link, val link
  • For downloading on a Linux server, see my velog post: link
  • For more information, see the original ImageNet website: link

2. Data Processing

2.1. About Data

2.1.1. ImageNet-1K Train Dataset

  • The ImageNet-1K train archive is a tar file that contains one tar file per class, as shown below
└── ILSVRC2012_img_train.tar
    ├── n01440764.tar
    ├── n01443537.tar
    ├── n01484850.tar
    ├── ...
    └── n15075141.tar
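
Because the train archive is a tar of tars, unpacking takes two levels of extraction. The following is a minimal sketch of that idea, not the repo's unpack.py: the function name `unpack_train` and the `wnid_to_idx` mapping (WordNet ID to class number) are assumptions made for illustration.

```python
import tarfile
from pathlib import Path


def unpack_train(train_tar, out_dir, wnid_to_idx):
    """Extract each per-class tar inside the outer train tar into out_dir/<class number>/.

    wnid_to_idx is a hypothetical dict such as {"n01440764": 0, ...}.
    """
    out_dir = Path(out_dir)
    with tarfile.open(train_tar) as outer:
        for member in outer:
            if not member.isfile() or not member.name.endswith(".tar"):
                continue
            wnid = Path(member.name).stem  # e.g. "n01440764"
            class_dir = out_dir / str(wnid_to_idx[wnid])
            class_dir.mkdir(parents=True, exist_ok=True)
            # Open the inner tar straight from the outer one,
            # without writing the inner tar to disk first.
            inner_file = outer.extractfile(member)
            with tarfile.open(fileobj=inner_file) as inner:
                for image in inner:
                    if image.isfile():
                        data = inner.extractfile(image).read()
                        # Keep only the base file name to avoid nested paths.
                        (class_dir / Path(image.name).name).write_bytes(data)
```

Reading the inner tars through `extractfile` avoids storing a second copy of the per-class archives, at the cost of processing the classes sequentially.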

2.1.2. ImageNet-1K Val Dataset

  • The ImageNet-1K val archive is a tar file that contains the images directly, as shown below
└── ILSVRC2012_img_val.tar
    ├── ILSVRC2012_val_00000001.JPEG
    ├── ILSVRC2012_val_00000002.JPEG
    ├── ILSVRC2012_val_00000003.JPEG
    ├── ...
    └── ILSVRC2012_val_00050000.JPEG

2.2. File Descriptions

  • ImageNet_class_index.json : contains the class information
    • Caution : some labels are shared by more than one class number
      • crane : 134, 517
      • maillot : 638, 639
  • ImageNet_val_label.txt : contains the label of each validation image
  • check.py : checks whether the archives were unpacked correctly
  • unpack.py : builds clean file trees from ILSVRC2012_img_train.tar and ILSVRC2012_img_val.tar so they can be used as a dataset
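
The duplicate-label caution can be verified programmatically. The sketch below assumes ImageNet_class_index.json follows the commonly used layout `{"<class number>": ["<WordNet ID>", "<label>"], ...}`; that layout is an assumption, not something confirmed here, and `find_duplicate_labels` is a hypothetical helper. The sample uses placeholder WordNet IDs for the duplicated classes.

```python
import json
from collections import defaultdict


def find_duplicate_labels(class_index):
    """Return {label: [class numbers]} for labels used by more than one class."""
    by_label = defaultdict(list)
    for idx, (wnid, label) in class_index.items():
        by_label[label].append(int(idx))
    return {label: sorted(ids) for label, ids in by_label.items() if len(ids) > 1}


# Tiny hand-written sample in the assumed format (placeholder WordNet IDs).
sample = json.loads("""
{
  "134": ["wnid_a", "crane"],
  "517": ["wnid_b", "crane"],
  "638": ["wnid_c", "maillot"],
  "639": ["wnid_d", "maillot"],
  "0":   ["n01440764", "tench"]
}
""")

print(find_duplicate_labels(sample))
# {'crane': [134, 517], 'maillot': [638, 639]}
```

Because of these duplicates, a label-to-class-number dict loses entries; keying lookups by class number or WordNet ID avoids that.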

2.3. Run


  1. Place all the required files in the same directory (base_dir), as shown below
└── base_dir
    ├── ILSVRC2012_img_train.tar
    ├── ILSVRC2012_img_val.tar
    ├── ImageNet_class_index.json
    └── ImageNet_val_label.txt

  2. In unpack.py, set the base_dir and target_dir variables

  3. Run unpack.py; it builds the file tree below in the specified directory (target_dir)
└── target_dir
    ├── train
    │   ├── 0
    │   │   ├── n01440764_18.JPEG
    │   │   ├── n01440764_36.JPEG
    │   │   └── ...
    │   ├── 1
    │   ├── ...
    │   └── 999
    └── val
        ├── 0
        │   ├── ILSVRC2012_val_00000293.JPEG
        │   ├── ILSVRC2012_val_00002138.JPEG
        │   └── ...
        ├── 1
        ├── ...
        └── 999
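
With this layout, the folder name is the class number. One thing to watch when loading it: tools that sort folder names as strings (torchvision's ImageFolder, for example) order them as 0, 1, 10, 100, ..., so folder 10 would receive index 2. A minimal sketch that reads the label from the folder name instead, with `list_samples` as a hypothetical helper:

```python
from pathlib import Path


def list_samples(split_dir):
    """Return (image path, class number) pairs for <split_dir>/<class number>/<image>."""
    samples = []
    class_dirs = [d for d in Path(split_dir).iterdir() if d.is_dir() and d.name.isdigit()]
    # Sort numerically so that folder "2" comes before folder "10".
    for class_dir in sorted(class_dirs, key=lambda d: int(d.name)):
        for image in sorted(class_dir.glob("*.JPEG")):
            samples.append((image, int(class_dir.name)))
    return samples
```

For example, `list_samples("target_dir/val")` would yield pairs whose second element matches the class numbers in ImageNet_class_index.json, assuming the tree was built as shown above.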

  4. In check.py, set the ImageNet_dir variable and run it to double-check the result
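
What check.py verifies is not described here. As an independent sanity check, one option is to count the images per class folder and compare the totals with the published ILSVRC2012 sizes (1,281,167 train images and 50,000 val images). The sketch below is an assumption-based illustration; `count_images` and `check_split` are hypothetical names.

```python
from pathlib import Path


def count_images(split_dir):
    """Return {class number: image count} for <split_dir>/<class number>/<image>."""
    return {
        int(d.name): sum(1 for _ in d.glob("*.JPEG"))
        for d in Path(split_dir).iterdir()
        if d.is_dir() and d.name.isdigit()
    }


def check_split(split_dir, expected_total, num_classes=1000):
    """Return a list of problem descriptions; an empty list means the split looks fine."""
    counts = count_images(split_dir)
    problems = []
    missing = sorted(set(range(num_classes)) - set(counts))
    if missing:
        problems.append(f"missing class folders: {missing}")
    total = sum(counts.values())
    if total != expected_total:
        problems.append(f"expected {expected_total} images, found {total}")
    return problems


# Example usage (paths are placeholders):
# print(check_split("target_dir/train", expected_total=1_281_167))
# print(check_split("target_dir/val", expected_total=50_000))
```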
