This repository contains the implementation of Bottom-up and Top-down Object Inference Networks for Image Captioning.
- Python 3
- CUDA 10
- numpy
- tqdm
- easydict
- PyTorch (>1.0)
- torchvision
- coco-caption
- Download the bottom-up features and convert them to npz files:
python3 tools/create_feats.py --infeats bottom_up_tsv --outfolder ./mscoco/feature/up_down_10_100
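For reference, the conversion step boils down to decoding the base64-encoded region features in the TSV and writing one npz file per image. The sketch below only illustrates that idea, assuming the standard bottom-up-attention TSV layout (image_id, image_w, image_h, num_boxes, boxes, features); the field names, output key, and file naming are assumptions, not the exact behavior of tools/create_feats.py.

```python
# Minimal sketch of the TSV -> npz conversion (assumes the standard
# bottom-up-attention TSV format; field names, output key and file naming
# are assumptions, not the exact behavior of tools/create_feats.py).
import base64
import csv
import os
import sys

import numpy as np

FIELDNAMES = ['image_id', 'image_w', 'image_h', 'num_boxes', 'boxes', 'features']
csv.field_size_limit(sys.maxsize)  # feature columns are very large

def convert(infeats_tsv, outfolder):
    os.makedirs(outfolder, exist_ok=True)
    with open(infeats_tsv) as f:
        reader = csv.DictReader(f, delimiter='\t', fieldnames=FIELDNAMES)
        for row in reader:
            num_boxes = int(row['num_boxes'])
            # Region features are stored as base64-encoded float32 buffers.
            feats = np.frombuffer(base64.b64decode(row['features']),
                                  dtype=np.float32).reshape(num_boxes, -1)
            # Write one npz file per image, named by image id.
            np.savez_compressed(os.path.join(outfolder, str(row['image_id'])),
                                feat=feats)

if __name__ == '__main__':
    convert('bottom_up_tsv', './mscoco/feature/up_down_10_100')
```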
- Download the annotations into the mscoco folder. More details about data preparation can be found in self-critical.pytorch.
- Download coco-caption and set __C.INFERENCE.COCO_PATH in lib/config.py to its location.
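The setting in lib/config.py might look like the following. This is only an illustration: the surrounding easydict config structure is the repository's own, and the path value is a placeholder for wherever you cloned coco-caption.

```python
from easydict import EasyDict as edict

# Illustration of the relevant setting in lib/config.py (names taken from the
# instructions above; the surrounding config structure is assumed).
__C = edict()
__C.INFERENCE = edict()
__C.INFERENCE.COCO_PATH = './coco_caption'  # path to your coco-caption checkout
```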
Train the model:

bash experiments/btonet/train.sh
Test a trained checkpoint (model_epoch is a placeholder for the checkpoint to resume):

CUDA_VISIBLE_DEVICES=0 python3 main_test.py --folder experiments/btonet --resume model_epoch
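main_test.py is expected to invoke the coco-caption metrics itself. For orientation, a standalone evaluation with coco-caption looks roughly like the sketch below; the annotation and result file paths are placeholders, and it assumes coco-caption is importable from your PYTHONPATH.

```python
# Rough sketch of how coco-caption scores a result file
# (file paths are placeholders; assumes coco-caption is importable).
from pycocotools.coco import COCO
from pycocoevalcap.eval import COCOEvalCap

coco = COCO('mscoco/annotations/captions_val2014.json')   # ground-truth captions
coco_res = coco.loadRes('results/captions_results.json')  # generated captions
coco_eval = COCOEvalCap(coco, coco_res)
coco_eval.params['image_id'] = coco_res.getImgIds()       # score only generated images
coco_eval.evaluate()

for metric, score in coco_eval.eval.items():
    print(f'{metric}: {score:.3f}')
```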
Thanks to self-critical.pytorch and the awesome PyTorch team for their contributions.