Multi-modal fusion techniques for generating images using cityscapes dataset

onat-dalmaz/multi-modal-I2I

multi-modal-pix2pix

We apply multi-modal fusion techniques to the Cityscapes dataset. To obtain an additional modality, we extract boundary maps from the dataset's instance maps.

Dependencies

python>=3.6
torch>=1.4.0
torchvision>=0.5.0
scikit-learn>=0.20.2
opencv-python>=4.0
dominate>=2.4.0
visdom>=0.1.8.8

Installation

  • Clone this repo:
git clone https://github.com/onimu23/multi-modal-pix2pix
cd multi-modal-pix2pix

Dataset

First, download the Cityscapes dataset from the official web page (https://www.cityscapes-dataset.com/). You need gtFine_trainvaltest.zip and leftImg8bit_trainvaltest.zip. Unzip both archives, then run the preparecityscapes.py script to preprocess the dataset and extract the boundary maps from the instance maps.

python3 preparecityscapes.py 
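
To make the boundary-extraction step concrete, here is a minimal sketch of one common way to derive a boundary map from an instance map: a pixel is marked as boundary if any of its 4-connected neighbours carries a different instance ID. The function name instance_to_boundary and the 4-neighbour rule are assumptions for illustration; preparecityscapes.py may implement this differently.

```python
import numpy as np


def instance_to_boundary(instance_map):
    """Return a uint8 boundary map (255 on boundaries, 0 elsewhere).

    Hypothetical helper, not taken from preparecityscapes.py.
    A pixel is a boundary pixel if its left/right/up/down neighbour
    has a different instance ID.
    """
    inst = np.asarray(instance_map)
    edge = np.zeros(inst.shape, dtype=bool)

    # Horizontal neighbours: mark both pixels on each side of a change.
    diff_h = inst[:, 1:] != inst[:, :-1]
    edge[:, 1:] |= diff_h
    edge[:, :-1] |= diff_h

    # Vertical neighbours: same rule along the other axis.
    diff_v = inst[1:, :] != inst[:-1, :]
    edge[1:, :] |= diff_v
    edge[:-1, :] |= diff_v

    return edge.astype(np.uint8) * 255
```

In practice the instance map would be loaded from a Cityscapes *_gtFine_instanceIds.png file (for example with OpenCV using an unchanged-depth read, since instance IDs exceed 8 bits) before being passed to a function like this.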

For pix2pix

The prepared data should be in a cityscapes dataset folder; pass its actual location via --dataroot. For training:

python3 train.py --dataroot ./datasets/datasets/cityscapes/ --name cityscapes_boundary --model pix2pix --direction BtoA --input_nc 2
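
The --input_nc 2 flag tells the generator to expect a two-channel input. A plausible reading, stated here as an assumption rather than a description of this repository's data loader, is that the two channels are the semantic map and the boundary map stacked together (early fusion). The sketch below shows that stacking with a hypothetical helper stack_modalities:

```python
import numpy as np


def stack_modalities(semantic, boundary):
    """Stack two single-channel maps into one (2, H, W) array.

    Hypothetical helper: assumes the two input channels are the
    semantic map and the boundary map, which the source does not state.
    """
    semantic = np.asarray(semantic, dtype=np.float32)
    boundary = np.asarray(boundary, dtype=np.float32)
    if semantic.ndim != 2 or semantic.shape != boundary.shape:
        raise ValueError("expected two 2-D maps of identical shape")
    # Channel-first layout, as PyTorch convolution layers expect.
    return np.stack([semantic, boundary], axis=0)
```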

For testing,

python3 test.py --dataroot ./datasets/cityscapes --name cityscapes_boundary --model pix2pix --direction BtoA

The results will be placed under the results folder.

For CEN

You should structure your dataset in the following way:

/CEN/image2image_translation/data/
  ├── semantic_map 
  ├── boundary_map  
  ├── image   
  ├── train_domain.txt    
  └── val_domain.txt

The preprocessing script contains the relevant parts as commented-out code; uncomment them to structure your dataset in this way. To train the model, use the following commands.
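
Before training, it can help to confirm that the data folder matches the layout listed above. The following is a small sketch using only the standard library; the function name missing_entries is hypothetical and is not part of CEN or this repository.

```python
from pathlib import Path

# Entries taken from the directory tree listed above.
EXPECTED_DIRS = ("semantic_map", "boundary_map", "image")
EXPECTED_FILES = ("train_domain.txt", "val_domain.txt")


def missing_entries(data_root):
    """Return the names of expected folders/files absent under data_root."""
    root = Path(data_root)
    missing = [d for d in EXPECTED_DIRS if not (root / d).is_dir()]
    missing += [f for f in EXPECTED_FILES if not (root / f).is_file()]
    return missing
```

An empty list means every expected entry is present; otherwise the returned names show what still needs to be created by the preprocessing script.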

cd CEN
python3 main.py --gpu 0 --img-types 0 1 2 -c pix2pix_boundary

Modalities

0:semantic_map
1:boundary_map
2:image

Every 5 epochs, the script evaluates the model on the validation set using FID and KID scores. The results are stored in

/CEN/image2image_translation/ckpt/pix2pix_boundary/results/

FID KID scores

To calculate FID and KID scores between the generated results and the real images:

cd FID_KID
python3 fid_kid.py --fake_path /auto/data2/odalmaz/CVproject/pytorch-CycleGAN-and-pix2pix/results/cityscapes_boundary/test_latest/images/fake_B/ --real_path /auto/data2/odalmaz/CVproject/pytorch-CycleGAN-and-pix2pix/results/cityscapes_boundary/test_latest/images/real_B/

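Pass the directory of generated images to --fake_path and the directory of real images to --real_path. The paths in the command above are examples from the author's machine; replace them with your own.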
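
For reference, the standard definitions of the two metrics are given below. Both are computed on Inception features of the real and generated images. Here \mu_r, \Sigma_r and \mu_g, \Sigma_g are the feature mean and covariance of the real and generated sets, x_i and y_j are individual feature vectors (m real, n generated), and d is the feature dimension. Whether fid_kid.py follows these exact forms (for example, whether it averages KID over several subsets) is an assumption not confirmed by this README.

```latex
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g
  - 2 \left( \Sigma_r \Sigma_g \right)^{1/2} \right)

\mathrm{KID} = \frac{1}{m(m-1)} \sum_{i \neq j} k(x_i, x_j)
  + \frac{1}{n(n-1)} \sum_{i \neq j} k(y_i, y_j)
  - \frac{2}{mn} \sum_{i=1}^{m} \sum_{j=1}^{n} k(x_i, y_j),
\qquad
k(x, y) = \left( \frac{x^{\top} y}{d} + 1 \right)^{3}
```

Lower values are better for both metrics.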

Acknowledgments

This code borrows heavily from pytorch-CycleGAN-and-pix2pix and CEN.
