This repository contains the training code for classifying temples by their country of origin. Currently, multiple models are supported: ResNet-50 / ResNet-101, PSPNet-50 / PSPNet-101, and VGG. The preprocessing is based on the Inception paper.
The input image is resized to 256x256, and a crop of random size (0.08 to 1.0 of the original area) and random aspect ratio (3/4 to 4/3 of the original aspect ratio) is taken. This crop is then resized to 224x224. Furthermore, random 90-degree rotations and random Gaussian blur are applied to the input image. Optionally, a random patch of the image is erased (its pixel values are set to 0).
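The random crop described above can be sketched in plain Python. This is a minimal illustration of how Inception-style crop parameters (random area fraction and aspect ratio) are typically sampled; the function name and retry logic are assumptions, not the repository's actual implementation.

```python
import math
import random

def random_resized_crop_params(height, width, scale=(0.08, 1.0),
                               ratio=(3 / 4, 4 / 3), rng=random):
    """Sample a crop box with a random area fraction and aspect ratio.

    Mirrors the augmentation described above: the crop covers 0.08-1.0
    of the image area with an aspect ratio drawn from 3/4-4/3.
    Illustrative sketch only; not the repository's code.
    """
    area = height * width
    for _ in range(10):  # retry until the sampled box fits the image
        target_area = rng.uniform(*scale) * area
        log_ratio = (math.log(ratio[0]), math.log(ratio[1]))
        aspect = math.exp(rng.uniform(*log_ratio))
        w = int(round(math.sqrt(target_area * aspect)))
        h = int(round(math.sqrt(target_area / aspect)))
        if 0 < w <= width and 0 < h <= height:
            top = rng.randint(0, height - h)
            left = rng.randint(0, width - w)
            return top, left, h, w
    # Fallback: return the full image if no valid box was sampled
    return 0, 0, height, width
```

The returned box would then be cropped out and resized to 224x224 before the remaining augmentations are applied.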
- Download the temples dataset.
- Set the path to your dataset in the temples dataloader file. Currently, 85% of the dataset is used for training and 15% for validation.
- Check the available parser options.
- Create the environment from the conda file: `conda env create -f environment.yml`
- Activate the conda environment: `conda activate toptal`
- Train the networks using the provided training script. The trained model is saved to the directory given by the `save_dir` command line argument.
- Run the inference script on your set. The `test_dir` command line argument should be used to provide the relative path to the folder containing the images to be classified. A file `results.csv` will be created containing the names of the files in the folder and the corresponding predicted classes.
Model | Accuracy |
---|---|
ResNet-50 | 0.8143 |
ResNet-101 | 0.802 |
PSPNet-50 | 0.785 |
PSPNet-101 | 0.778 |

Model | Accuracy |
---|---|
ResNet-50 | 0.8143 |
ResNet-50 random erasing | 0.8333 |