Spectral Hint GAN


This repo hosts the official implementation of:

Xingqian Xu, Shant Navasardyan, Vahram Tadevosyan, Andranik Sargsyan, Yadong Mu and Humphrey Shi, Image Completion with Heterogeneously Filtered Spectral Hints, Paper arXiv Link.

News

  • [2022.11.12]: Evaluation code and pretrained models released.
  • [2022.11.07]: Our paper is accepted to WACV 2023.
  • [2022.11.06]: Repo initiated.

Introduction

Spectral Hint GAN (SH-GAN) is a high-performing inpainting network empowered by CoModGAN and novel spectral processing techniques. SH-GAN reaches state-of-the-art performance on FFHQ and Places2 with free-form masks.

Network and Algorithm

The overall structure of our SH-GAN is shown in the following figure:

The structure of our Spectral Hint Unit is shown in the following figure:

Heterogeneous Filtering Explanation:

  • A 1x1 convolution in the Fourier domain applies a uniform (homogeneous) transform from one spectral space to another.
  • A ReLU in the Fourier domain acts like a value-dependent band-pass filter that zeros out some frequency values.
  • We promote heterogeneous transforms in spectral space, in which the transformation of a frequency value depends on its frequency band (see the sketch after this list).
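
To make the contrast concrete, below is a minimal PyTorch sketch of a band-dependent (heterogeneous) spectral transform. It is an illustration only, not the repo's actual Spectral Hint Unit: the class name HeterogeneousSpectralFilter, the number of bands, and the radial band partition are all assumptions.

# Minimal sketch of a heterogeneous spectral transform (illustration only,
# not the repo's Spectral Hint Unit). Each radial frequency band gets its
# own 1x1 transform, instead of one homogeneous 1x1 transform for all bands.
import torch
import torch.nn as nn

class HeterogeneousSpectralFilter(nn.Module):  # hypothetical name
    def __init__(self, channels, bands=3):
        super().__init__()
        # one 1x1 transform per band, acting on stacked real/imag channels
        self.filters = nn.ModuleList(
            [nn.Conv2d(2 * channels, 2 * channels, kernel_size=1) for _ in range(bands)]
        )
        self.bands = bands

    def forward(self, x):
        b, c, h, w = x.shape
        spec = torch.fft.rfft2(x, norm="ortho")           # complex spectrum
        feat = torch.cat([spec.real, spec.imag], dim=1)   # real/imag as channels

        # normalized radial distance of each frequency bin from DC
        fy = torch.fft.fftfreq(h, device=x.device).abs().view(-1, 1)
        fx = torch.fft.rfftfreq(w, device=x.device).abs().view(1, -1)
        radius = torch.sqrt(fy ** 2 + fx ** 2)
        radius = radius / radius.max()

        out = torch.zeros_like(feat)
        for i, f in enumerate(self.filters):
            lo = i / self.bands
            hi = (i + 1) / self.bands if i < self.bands - 1 else 1.0 + 1e-6
            mask = (radius >= lo) & (radius < hi)         # band membership
            out = out + f(feat) * mask                    # band-dependent transform
        real, imag = out.chunk(2, dim=1)
        return torch.fft.irfft2(torch.complex(real, imag), s=(h, w), norm="ortho")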

Gaussian Split Algorithm Explanation:

  • Gaussian Split is a spectral-space downsampling method that suits deep learning architectures well. A quick intuition: like the Wavelet Transform, it passes the information in each frequency band down to its corresponding resolution (see the sketch below).
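
As a rough illustration of that intuition, here is a hedged PyTorch sketch that splits a feature map's spectrum into a low-frequency part kept at half resolution and a high-frequency residual kept at full resolution. It is not the exact Gaussian Split algorithm from the paper; the function name gaussian_split, the Gaussian window shape, and the fixed 2x split factor are assumptions.

# Rough illustration of the Gaussian Split intuition (NOT the paper's exact
# algorithm). Low frequencies are weighted by a Gaussian window, cropped, and
# routed to half resolution, much like a wavelet transform routes each
# frequency band to its matching scale; the rest stays at full resolution.
import torch

def gaussian_split(x, sigma=0.5):  # hypothetical helper
    b, c, h, w = x.shape
    spec = torch.fft.fftshift(torch.fft.fft2(x, norm="ortho"), dim=(-2, -1))

    # Gaussian window centered at DC (assumed form of the split weight)
    yy = torch.linspace(-1, 1, h, device=x.device).view(-1, 1)
    xx = torch.linspace(-1, 1, w, device=x.device).view(1, -1)
    win = torch.exp(-(yy ** 2 + xx ** 2) / (2 * sigma ** 2))

    low_spec = spec * win
    high_spec = spec * (1 - win)

    # keep only the central (low-frequency) crop for the lower resolution
    h2, w2 = h // 2, w // 2
    top, left = (h - h2) // 2, (w - w2) // 2
    low_crop = low_spec[..., top:top + h2, left:left + w2]

    low = torch.fft.ifft2(torch.fft.ifftshift(low_crop, dim=(-2, -1)), norm="ortho").real
    high = torch.fft.ifft2(torch.fft.ifftshift(high_spec, dim=(-2, -1)), norm="ortho").real
    return low, high  # (b, c, h/2, w/2) and (b, c, h, w)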

Data

We use FFHQ and Places2 as our main datasets. Download them from the following official links: FFHQ, Places2.

Directory of FFHQ data for our code:

├── data
│   └── ffhq
│       └── ffhq256x256.zip
│       └── ffhq512x512.zip

Directory of Places2 data for our code:

  • Download the data_challenge.zip from the Places2 official website and decompress it to /data/Places2.
  • Do the same for val_large.zip.

├── data
│   └── Places2
│       └── data_challenge
│           ...
│       └── val_large
│           ...

Setup

conda create -n shgan python=3.8
conda activate shgan
conda install pytorch==1.8.0 torchvision==0.9.0 torchaudio==0.8.0 cudatoolkit=11.1 -c pytorch -c conda-forge
pip install -r requirement.txt

Results and pretrained models

Model     DIM  DATA     FID     LPIPS   PSNR   SSIM    Download
CoModGAN  256  FFHQ     4.7755  0.2568  16.24  0.5913  -
SH-GAN    256  FFHQ     4.3459  0.2542  16.37  0.5911  link
CoModGAN  512  FFHQ     3.6996  0.2469  18.46  0.6956  -
SH-GAN    512  FFHQ     3.4134  0.2447  18.43  0.6936  link
CoModGAN  256  Places2  9.3621  0.3990  14.50  0.4923  -
SH-GAN    256  Places2  7.5036  0.3940  14.58  0.4958  link
CoModGAN  512  Places2  7.9735  0.3420  16.00  0.5953  -
SH-GAN    512  Places2  7.0277  0.3386  16.03  0.5973  link

Evaluation

Here are the one-line shell commands to evaluate SH-GAN on FFHQ 256/512 and Places2 256/512.

python main.py --experiment shgan_ffhq256_eval --gpu 0 1 2 3 4 5 6 7 --eval 99999
python main.py --experiment shgan_ffhq512_eval --gpu 0 1 2 3 4 5 6 7 --eval 99999
python main.py --experiment shgan_places256_eval --gpu 0 1 2 3 4 5 6 7 --eval 99999
python main.py --experiment shgan_places512_eval --gpu 0 1 2 3 4 5 6 7 --eval 99999

You also need to:

  • Download the data and place it in the directories described in the Data section.
  • Create ./pretrained and move all downloaded pretrained models into it.
  • Create ./log/shgan_ffhq/99999_eval and ./log/shgan_places2/99999_eval.

Some simple things that may resolve common issues:

  • The evaluation code caches and later relies on .cache/****_real_feat.npy for the FID calculation. If this file is corrupted, the numbers will be wrong, but you can simply remove it and the code will recompute it automatically.
  • The final stage of the FID computation requires CPU resources, so it is normal for it to be slow; please be patient.

Training

Coming soon.

Citation

@inproceedings{xu2023image,
  title={Image Completion with Heterogeneously Filtered Spectral Hints},
  author={Xu, Xingqian and Navasardyan, Shant and Tadevosyan, Vahram and Sargsyan, Andranik and Mu, Yadong and Shi, Humphrey},
  booktitle={Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision},
  pages={4591--4601},
  year={2023}
}

Acknowledgement

Part of the code reorganizes/reimplements code from the following repositories: the CoModGAN official GitHub and the StyleGAN2-ADA official GitHub.