This repository contains a build script for the ImageNet data set. After cloning the repository, you will need to create an account with ImageNet and download the following files before you can use the build script.
1. ILSVRC2012_devkit_t12.tar.gz
2. ILSVRC2012_img_train.tar
3. ILSVRC2012_img_val.tar
4. ILSVRC2012_img_test_v10102019.tar
File 1, which contains the ImageNet development kit, should be moved into the ./src directory and then extracted.
$ tar -xzf ILSVRC2012_devkit_t12.tar.gz
Files 2-4 contain the raw image files and should be moved to the ./data directory. You do not need to extract files 2-4; the build script handles the extraction.
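Before running the build, it can help to confirm that all four downloads are where the instructions above expect them. The following is a minimal sketch, not part of this repository: the helper name `missing_downloads` is hypothetical, and it assumes the script is run from the repository root.

```python
from pathlib import Path

# Expected locations, taken from the instructions above.
# `missing_downloads` is a hypothetical helper, not part of this repository.
EXPECTED_FILES = {
    "src": ["ILSVRC2012_devkit_t12.tar.gz"],
    "data": [
        "ILSVRC2012_img_train.tar",
        "ILSVRC2012_img_val.tar",
        "ILSVRC2012_img_test_v10102019.tar",
    ],
}


def missing_downloads(repo_root="."):
    """Return the expected archive paths that do not exist under repo_root."""
    root = Path(repo_root)
    missing = []
    for directory, filenames in EXPECTED_FILES.items():
        for filename in filenames:
            path = root / directory / filename
            if not path.is_file():
                missing.append(path)
    return missing


if __name__ == "__main__":
    # Assumes the current working directory is the repository root.
    for path in missing_downloads():
        print(f"missing: {path}")
```

If the script prints nothing, every archive was found in its expected directory.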
The following commands create and activate the Conda environment containing the Python packages needed to build the ImageNet data set.
$ conda env create --prefix ./env --file environment.yml
$ conda activate ./env
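To check that the environment created above is the one in use, you can compare the running interpreter's prefix with the ./env directory. This is only a sketch: the function name `environment_is_active` is hypothetical, and it assumes the check is run from the repository root with the Python interpreter found on your PATH after activation.

```python
import sys
from pathlib import Path


def environment_is_active(env_dir="./env", prefix=None):
    """Return True if the interpreter prefix resolves to env_dir.

    `prefix` defaults to sys.prefix, which points at the environment
    directory when Python is started from an activated Conda environment.
    """
    prefix = sys.prefix if prefix is None else prefix
    return Path(prefix).resolve() == Path(env_dir).resolve()


if __name__ == "__main__":
    # Assumes the current working directory is the repository root.
    print("environment active:", environment_is_active())
```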
Running the following commands will extract and reorganize the raw *.JPEG images that make up the ImageNet classification and localization data set. The resulting training, validation, and testing images can be found in ./data/jpeg/train, ./data/jpeg/val, and ./data/jpeg/test, respectively.
$ cd ./src
$ python build_classification_localization_data.py
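Once the build finishes, a quick way to sanity-check the result is to count the *.JPEG files in each output directory. The sketch below is not part of the build script. The helper name `count_jpegs` is hypothetical, it assumes the images keep the upper-case .JPEG extension, and it should be run from the repository root rather than from ./src.

```python
from pathlib import Path


def count_jpegs(directory):
    """Count files ending in .JPEG anywhere below directory.

    Returns 0 if the directory does not exist.
    """
    directory = Path(directory)
    if not directory.is_dir():
        return 0
    return sum(1 for path in directory.rglob("*.JPEG") if path.is_file())


if __name__ == "__main__":
    # Output locations as described above, relative to the repository root.
    for split in ("train", "val", "test"):
        print(split, count_jpegs(Path("data") / "jpeg" / split))
```

A count of zero for any split suggests that the corresponding archive was missing from ./data or that the build did not complete.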