Birds of the British Empire

PyTorch implementation for reproducing AttnGAN results in the paper AttnGAN: Fine-Grained Text to Image Generation with Attentional Generative Adversarial Networks by Tao Xu, Pengchuan Zhang, Qiuyuan Huang, Han Zhang, Zhe Gan, Xiaolei Huang, Xiaodong He. (This work was performed when Tao was an intern with Microsoft Research.)

Dependencies

Python 3.5

PyTorch

In addition, please add the project folder to PYTHONPATH and pip install the following packages:

  • python-dateutil
  • easydict
  • pandas
  • torchfile
  • nltk
  • scikit-image
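As a quick sanity check, the list above can be verified programmatically. Note that two packages import under different names than their pip names: python-dateutil imports as dateutil, and scikit-image imports as skimage. A minimal sketch:

```python
import importlib.util

# Import names for the pip packages listed above; `dateutil` and `skimage`
# are the import names of python-dateutil and scikit-image respectively.
REQUIRED = ["dateutil", "easydict", "pandas", "torchfile", "nltk", "skimage"]

def missing_packages(modules=REQUIRED):
    """Return the subset of `modules` that cannot be imported."""
    return [m for m in modules if importlib.util.find_spec(m) is None]

if __name__ == "__main__":
    missing = missing_packages()
    print("pip install needed for:", missing if missing else "nothing")
```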

Data

  1. Download our preprocessed metadata for birds and coco, and save it to data/
  2. Download the birds image data. Extract them to data/birds/
  3. Download coco dataset and extract the images to data/coco/

Training

  • Pre-train DAMSM models:

    • For bird dataset: python pretrain_DAMSM.py --cfg cfg/DAMSM/bird.yml --gpu 0
    • For coco dataset: python pretrain_DAMSM.py --cfg cfg/DAMSM/coco.yml --gpu 1
  • Train AttnGAN models:

    • For bird dataset: python main.py --cfg cfg/bird_attn2.yml --gpu 2
    • For coco dataset: python main.py --cfg cfg/coco_attn2.yml --gpu 3
  • *.yml files are example configuration files for training/evaluating our models.
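The four commands above differ only in script, config file, and GPU id, so a small launcher can enumerate them. The script and cfg paths below are copied from the commands above; the subprocess wrapper is just a sketch:

```python
import subprocess

def train_command(stage, dataset, gpu):
    """Build one of the training command lines listed above.

    stage: "damsm" (DAMSM pre-training) or "attngan"; dataset: "bird" or "coco".
    """
    if stage == "damsm":
        script, cfg = "pretrain_DAMSM.py", f"cfg/DAMSM/{dataset}.yml"
    else:
        script, cfg = "main.py", f"cfg/{dataset}_attn2.yml"
    return ["python", script, "--cfg", cfg, "--gpu", str(gpu)]

def run_stage(stage, dataset, gpu):
    """Launch a stage; DAMSM pre-training must finish before AttnGAN training."""
    subprocess.run(train_command(stage, dataset, gpu), check=True)
```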

Pretrained Model

Sampling

  • Run python main.py --cfg cfg/eval_bird.yml --gpu 1 to generate examples from captions in files listed in ./data/birds/example_filenames.txt. Results are saved to DAMSMencoders/.
  • For sampling, be sure to set TRAIN.FLAG and B_VALIDATION to False. To run the model on a CPU, set the --gpu parameter to a negative value. The file example_filenames.txt should contain a list of files, where each file has one caption per line. After execution, AttnGAN will generate 3 image files (of different qualities) and 2 attention maps.
  • Change the eval_*.yml files to generate images from other pre-trained models.
  • Put your own sentences in "./data/birds/example_captions.txt" if you want to generate images from customized sentences.

Validation

  • To generate images for all captions in the validation dataset, change B_VALIDATION to True in the eval_*.yml file, and then run python main.py --cfg cfg/eval_bird.yml --gpu 1
  • We compute inception score for models trained on birds using StackGAN-inception-model.
  • We compute inception score for models trained on coco using improved-gan/inception_score.
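Both linked tools implement the same quantity, IS = exp(E_x[KL(p(y|x) || p(y))]) over Inception class posteriors. A minimal reference computation of that definition (illustrative only, not the linked implementations, which also batch over splits and run a real Inception network):

```python
import math

def inception_score(preds):
    """preds: one class-probability distribution per generated image.

    Computes exp(mean_x KL(p(y|x) || p(y))), where p(y) is the marginal
    distribution over all images.
    """
    n, k = len(preds), len(preds[0])
    marginal = [sum(p[j] for p in preds) / n for j in range(k)]
    mean_kl = sum(
        sum(p[j] * math.log(p[j] / marginal[j]) for j in range(k) if p[j] > 0)
        for p in preds
    ) / n
    return math.exp(mean_kl)
```

Identical per-image distributions give a score of 1; confident and diverse predictions push the score toward the number of classes.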

Examples generated by AttnGAN [Blog]

(Images: bird example, coco example)

Creating an API

Evaluation code, embedded into a callable containerized API, is included in the eval/ folder.

Using InterfaceGAN to customize bird generation

For a given bird attribute in attributes.txt, InterfaceGAN can provide a direction for latent-code manipulation that makes the generated bird more positive or negative with respect to that attribute.

To obtain the direction as numpy array, InterfaceGAN needs a set of latent codes and their corresponding attribute values. The following files support that process:

  • batch_generate_birds.py generates bird images using random latent codes. The latent codes are stored in noise_vectors_array.npy and image information, including file location, is saved in the metadata_file.csv file.
  • organize_image_folder.py will organise the Caltech-UCSD Birds images into train and validation folders for a specific attribute from attributes.txt. This is needed to train a feature predictor for that attribute.
  • train_feature_predictor.py will train a transfer-learning based feature predictor, using the folder organised via organize_image_folder.py as data input. Model state will be stored in the feature_predictor.pt file.
  • batch_predict_feature.py will predict the value of a feature using the model trained with train_feature_predictor.py, over images generated from the noise_vectors_array.npy latent codes. Feature values will be stored in the predictions.npy numpy array.

We can later feed noise_vectors_array.npy and predictions.npy to the train_boundary.py module of InterfaceGAN to obtain the direction for attribute manipulation.
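InterfaceGAN's train_boundary.py fits a linear SVM to the (latent code, attribute score) pairs and returns its unit normal. A dependency-free sketch of the same idea, using the difference of class means instead of an SVM (an illustrative stand-in, not InterfaceGAN's actual implementation):

```python
import math

def boundary_direction(latent_codes, scores, ratio=0.1):
    """Approximate an attribute direction from scored latent codes.

    Takes the top/bottom `ratio` fraction of codes by attribute score and
    returns the unit vector pointing from the negative-group mean to the
    positive-group mean (a cheap proxy for a linear SVM normal).
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    k = max(1, int(len(scores) * ratio))
    neg = [latent_codes[i] for i in order[:k]]   # lowest-scored codes
    pos = [latent_codes[i] for i in order[-k:]]  # highest-scored codes
    dim = len(latent_codes[0])
    diff = [sum(p[j] for p in pos) / k - sum(q[j] for q in neg) / k
            for j in range(dim)]
    norm = math.sqrt(sum(d * d for d in diff)) or 1.0
    return [d / norm for d in diff]
```

Moving a latent code along (or against) this unit direction then makes the attribute more (or less) pronounced in the generated image.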

Once we have the boundary as a numpy array, we can use the AttnGAN/code/main.py file for image generation and interpolation. Use attnganw/config.py to configure the interpolation parameters.

Citing AttnGAN

If you find AttnGAN useful in your research, please consider citing:

@inproceedings{Tao18attngan,
  author    = {Tao Xu and Pengchuan Zhang and Qiuyuan Huang and Han Zhang and Zhe Gan and Xiaolei Huang and Xiaodong He},
  title     = {AttnGAN: Fine-Grained Text to Image Generation with Attentional Generative Adversarial Networks},
  year      = {2018},
  booktitle = {{CVPR}}
}

Reference