PyTorch implementation of image captioning.
This is an implementation of image captioning based on two papers:
- Show and Tell: A Neural Image Caption Generator
- Show, Attend and Tell: Neural Image Caption Generation with Visual Attention
The code is based on a-PyTorch-Tutorial-to-Image-Captioning.
To run the code, a file called "dataset_coco.json" needs to be downloaded and placed in the data folder. You can download the file here.
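As a quick sanity check after downloading, the file can be inspected with a few lines of Python. This is a minimal sketch that assumes "dataset_coco.json" follows the usual Karpathy split format (a top-level "images" list whose entries carry a "split" label and tokenized "sentences"); the path is illustrative.

```python
import json
from collections import Counter

# Assumes the Karpathy split format; adjust the path to match your data folder.
with open('data/dataset_coco.json') as f:
    data = json.load(f)

# How many images fall into each split (train / val / test / restval).
split_counts = Counter(img['split'] for img in data['images'])
print(split_counts)

# Peek at one image entry and one of its tokenized reference captions.
first = data['images'][0]
print(first['filename'], first['split'])
print(first['sentences'][0]['tokens'])
```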
- run pip install -r requirement.txt
- run chmod +x download.sh
- run ./download.sh
- run python create_input_files.py
- run python train-traditional.py (this trains the model from the paper "Show and Tell: A Neural Image Caption Generator")
- run python train-attention.py (this trains the model from the paper "Show, Attend and Tell: Neural Image Caption Generation with Visual Attention"; the attention module it adds is sketched below)
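The main difference between the two training scripts is the soft-attention mechanism introduced in the second paper. Below is a minimal sketch of an additive soft-attention module in the spirit of the tutorial this code is based on; the class name and the dimensions used in the usage example (2048-d CNN features over 14x14 locations, 512-d LSTM state) are illustrative assumptions, not necessarily what train-attention.py uses.

```python
import torch
import torch.nn as nn

class SoftAttention(nn.Module):
    """Additive soft attention over the encoder's spatial image features."""

    def __init__(self, encoder_dim, decoder_dim, attention_dim):
        super().__init__()
        self.encoder_att = nn.Linear(encoder_dim, attention_dim)  # project image features
        self.decoder_att = nn.Linear(decoder_dim, attention_dim)  # project decoder hidden state
        self.full_att = nn.Linear(attention_dim, 1)               # scalar score per spatial location
        self.relu = nn.ReLU()
        self.softmax = nn.Softmax(dim=1)

    def forward(self, encoder_out, decoder_hidden):
        # encoder_out: (batch, num_pixels, encoder_dim)
        # decoder_hidden: (batch, decoder_dim)
        att1 = self.encoder_att(encoder_out)                        # (batch, num_pixels, attention_dim)
        att2 = self.decoder_att(decoder_hidden).unsqueeze(1)        # (batch, 1, attention_dim)
        scores = self.full_att(self.relu(att1 + att2)).squeeze(2)   # (batch, num_pixels)
        alpha = self.softmax(scores)                                # attention weights over locations
        context = (encoder_out * alpha.unsqueeze(2)).sum(dim=1)     # (batch, encoder_dim) weighted sum
        return context, alpha

# Usage sketch with assumed dimensions: 14*14=196 locations of 2048-d features.
att = SoftAttention(encoder_dim=2048, decoder_dim=512, attention_dim=512)
context, alpha = att(torch.randn(4, 196, 2048), torch.randn(4, 512))
```

At each decoding step the context vector is concatenated with the word embedding and fed to the LSTM, while alpha can be kept for visualizing where the model "looks".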
The captioning (testing) scripts were tested in a PyCharm environment.
- run python caption-traditional.py (captioning with the traditional model)
- run python caption-attention.py (captioning with the attention model)
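Before captioning, the pretrained checkpoint has to be loaded and the input image preprocessed for the CNN encoder. The snippet below is a minimal sketch that assumes, as in the tutorial this repository is based on, that the checkpoint stores the whole encoder and decoder modules under 'encoder' and 'decoder' keys; the checkpoint and image file names are placeholders.

```python
import torch
import torchvision.transforms as T
from PIL import Image

# Hypothetical checkpoint path and key names, for illustration only.
checkpoint = torch.load('BEST_checkpoint_coco.pth.tar', map_location='cpu')
encoder = checkpoint['encoder'].eval()
decoder = checkpoint['decoder'].eval()

# Standard ImageNet preprocessing for the CNN encoder.
transform = T.Compose([
    T.Resize((256, 256)),
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])
image = transform(Image.open('example.jpg').convert('RGB')).unsqueeze(0)

with torch.no_grad():
    features = encoder(image)  # image features passed to the decoder's search routine
```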
You can download the pretrained models here:
- The traditional model; the password is yl2u.
- The attention model; the password is lsv7.