Artwork

Classification of Artwork Genres Using Image Data.

TODO

  • Design classification labels. (Perhaps we can use year intervals for now.)
  • Load data vectors and true labels into scikit-learn (sklearn), and begin training.
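
As a starting point for the year-interval idea, here is a minimal sketch of how a year could be mapped to an interval label. The function name, the 1800 start year, and the 50-year width are all hypothetical choices for illustration, not decisions this project has made.

```python
def year_to_interval_label(year, start=1800, width=50):
    """Map a year to an interval label such as '1850-1899'.

    `start` and `width` are placeholder defaults (assumptions),
    not values taken from this project.
    """
    if year < start:
        # Everything before the first interval shares one label.
        return "pre-{}".format(start)
    # Round down to the lower bound of the interval containing `year`.
    lower = start + ((year - start) // width) * width
    return "{}-{}".format(lower, lower + width - 1)


labels = [year_to_interval_label(y) for y in (1795, 1850, 1899, 1962)]
```

Labels produced this way are plain strings, so they could be passed to a classifier as the target vector once the design is settled.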

Setup

First, clone this repository and cd into it.

$ git clone https://github.com/hughhan1/artwork.git
$ cd artwork

Next, create a virtual environment and install the necessary pip packages.

$ virtualenv env
$ source env/bin/activate
$ pip install -r requirements.txt

Before going any further, retrieve the data. MoMA publishes its collection data in JSON format in its GitHub repository. Save the file as json/artworks.json.
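
To check that the file was saved correctly, a short sketch like the following can load it and filter the records. It assumes Python 3 and that the file holds a single JSON array of objects. The helper names are hypothetical and not part of this repository.

```python
import json


def load_artworks(path="json/artworks.json"):
    """Load artwork records, assuming the file is a JSON array of objects."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def with_field(records, field):
    """Keep only the records that have a non-empty value for `field`."""
    return [r for r in records if r.get(field)]
```

For example, `with_field(load_artworks(), "ThumbnailURL")` would keep the records that have a thumbnail link, assuming the data uses a field with that name.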

Scraping

To scrape images from the MoMA collection, run the following command.

$ python moma.py

The -t or --thumbnails option can be provided to obtain thumbnails instead of larger images, as shown below.

$ python moma.py -t

Regular sized images will be saved into the images/ directory, and thumbnails will be saved into the thumbnails/ directory.
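
For readers curious how a flag like -t can select the output directory, here is a minimal argparse sketch. It illustrates the behavior described above and is not the actual contents of moma.py. The function names are hypothetical.

```python
import argparse


def parse_args(argv=None):
    """Parse command-line options; `argv=None` means use sys.argv."""
    parser = argparse.ArgumentParser(description="Download artwork images.")
    parser.add_argument(
        "-t", "--thumbnails",
        action="store_true",
        help="download thumbnails instead of larger images",
    )
    return parser.parse_args(argv)


def output_dir(args):
    """Choose the target directory based on the thumbnails flag."""
    return "thumbnails" if args.thumbnails else "images"
```

Calling `output_dir(parse_args(["-t"]))` selects the thumbnails directory, and omitting the flag falls back to images.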

Image Processing / Dataset Creation

Images must be stored in a directory named "images" that contains only the relevant JPEG images to be processed. Remove any extraneous files, such as hidden files matching .*.
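
One way to spot extraneous files before building the dataset is sketched below. The helper is hypothetical and not part of utils.py. It assumes JPEG files end in .jpg or .jpeg, and it only lists candidates rather than deleting anything.

```python
import os

# Assumed JPEG extensions; adjust if the images use other suffixes.
JPEG_EXTENSIONS = (".jpg", ".jpeg")


def find_extraneous(directory="images"):
    """Return sorted names of hidden files and non-JPEG files in `directory`."""
    return sorted(
        name
        for name in os.listdir(directory)
        if name.startswith(".")
        or not name.lower().endswith(JPEG_EXTENSIONS)
    )
```

Reviewing the returned list by hand before removing anything avoids deleting images that merely have an unexpected extension.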

To create an h5py dataset file, run the following command.

$ python utils.py "dataset_1 dataset_2 ..."

where "dataset_1 dataset_2 ..." is a space-separated list of the datasets to use (e.g. moma, getty).
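
Because the dataset names are passed as one quoted argument, the script has to split that string itself. A minimal sketch of that step follows. The function name is hypothetical and this is not necessarily how utils.py handles it.

```python
def parse_dataset_names(arg):
    """Split a single space-separated argument into a list of dataset names."""
    names = arg.split()  # split() with no separator ignores repeated spaces
    if not names:
        raise ValueError("expected at least one dataset name")
    return names
```

With this approach, `parse_dataset_names("moma getty")` yields one name per dataset, and an empty string is rejected early rather than producing an empty dataset file.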
