miniCLIP

Implementation of the CLIP model with reduced capacity. For self-educational purposes only.

[Figure: clip_summary]

This repo currently contains only a CLIP-ResNet implementation, whereas the original paper covers 5 ResNet and 3 ViT models. The goal was not to beat SotA or train a superior version of CLIP; this is just an attempt to understand the logic behind CLIP.
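The core of that logic is a symmetric contrastive loss over a batch of image-text pairs. Below is a minimal sketch of it, not the code from this repo: the function names (`similarity_logits`, `cross_entropy_diagonal`, `clip_loss`) are hypothetical, the temperature is fixed at 0.07 as an assumption (in the paper it is a learned parameter), and plain Python lists are used instead of tensors so that the example runs without dependencies.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so that dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def similarity_logits(image_embs, text_embs, temperature=0.07):
    """Temperature-scaled cosine similarities: rows are images, columns are texts."""
    images = [l2_normalize(v) for v in image_embs]
    texts = [l2_normalize(v) for v in text_embs]
    return [
        [sum(a * b for a, b in zip(img, txt)) / temperature for txt in texts]
        for img in images
    ]


def cross_entropy_diagonal(logits):
    """Mean cross-entropy where the correct class for row k is column k."""
    total = 0.0
    for k, row in enumerate(logits):
        row_max = max(row)  # subtract the max for numerical stability
        log_sum_exp = row_max + math.log(sum(math.exp(x - row_max) for x in row))
        total += log_sum_exp - row[k]
    return total / len(logits)


def clip_loss(image_embs, text_embs, temperature=0.07):
    """Average of the image-to-text and text-to-image cross-entropies."""
    logits = similarity_logits(image_embs, text_embs, temperature)
    transposed = [list(col) for col in zip(*logits)]
    return 0.5 * (cross_entropy_diagonal(logits) + cross_entropy_diagonal(transposed))
```

The pair at index k in the batch is treated as the only positive for image k and for text k, and every other combination in the batch acts as a negative. A real training loop would compute the same quantity on batched tensors.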

Preliminary results

After training CLIP-ResNet50 for 10 epochs, the following results were obtained.

As can be seen, the results are not great, but the model is definitely starting to favor the correct pairs.

Example usage

Train

To run training, first download the COCO dataset and provide the paths to the annotations and images for both the train and val splits in a config (check example here). Then run:

python tools/train.py --path_to_config=configs/clip_base.yaml --path_to_log=logs/

This creates a separate directory structure under the logs/ directory for each run (an experiment directory):

logs/
  |--{experiment_name}/
      |--artifacts/
      |--checkpoints/
      |--train.log
      |--{experiment_name}.yaml               

A training_progress.log file containing the train and validation losses is saved under logs/{experiment_name}/artifacts/. Each training run also generates an overridden config and saves it under the logs/{experiment_name}/ directory.
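As an illustration of the layout described above, here is a small sketch that creates such an experiment directory and returns the paths where each file would go. The helper name `make_experiment_dir` and the keys of the returned dictionary are hypothetical and not taken from this repo; only the directory and file names come from the structure shown above.

```python
from pathlib import Path


def make_experiment_dir(log_root, experiment_name):
    """Create {log_root}/{experiment_name}/ with artifacts/ and checkpoints/ inside."""
    exp_dir = Path(log_root) / experiment_name
    artifacts = exp_dir / "artifacts"
    checkpoints = exp_dir / "checkpoints"
    # parents=True also creates log_root and exp_dir if they are missing
    artifacts.mkdir(parents=True, exist_ok=True)
    checkpoints.mkdir(parents=True, exist_ok=True)
    return {
        "root": exp_dir,
        "artifacts": artifacts,
        "checkpoints": checkpoints,
        "train_log": exp_dir / "train.log",
        "config": exp_dir / f"{experiment_name}.yaml",
        "progress_log": artifacts / "training_progress.log",
    }
```

Only the directories are created here; the log and config paths are returned so that the caller can write to them later.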

Plot similarity matrices

To plot similarity matrices on the validation dataset, run:

python tools/plot_similarities.py --path_to_config=logs/{experiment_name}/{experiment_name}.yaml \
                                  --path_to_ckpt=logs/{experiment_name}/checkpoints/some_ckpt.pth \
                                  --n_pairs=8 \
                                  --n_matricies=5

Here, n_matricies denotes the number of similarity matrices to create, and n_pairs denotes the number of image-text pairs to include in each similarity matrix. All the similarity matrices will be saved under logs/{experiment_name}/artifacts/.
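To make the roles of the two parameters concrete, here is a minimal sketch of how precomputed embeddings could be grouped into similarity matrices. It is not the logic of tools/plot_similarities.py: the function names are hypothetical, and taking consecutive chunks of pairs is an assumption about how the pairs are selected.

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def similarity_matrices(image_embs, text_embs, n_pairs, n_matrices):
    """Build up to n_matrices matrices of shape n_pairs x n_pairs.

    Entry [i][j] of a matrix is the cosine similarity between image i and text j
    of that chunk, so the correct pairs lie on the diagonal.
    """
    matrices = []
    for m in range(n_matrices):
        start = m * n_pairs
        images = image_embs[start:start + n_pairs]
        texts = text_embs[start:start + n_pairs]
        if len(images) < n_pairs or len(texts) < n_pairs:
            break  # not enough pairs left for a full matrix
        matrices.append([[cosine(img, txt) for txt in texts] for img in images])
    return matrices
```

With n_pairs=8 and n_matrices=5, as in the command above, this would consume the first 40 image-text pairs and produce five 8 x 8 matrices, each of which could then be rendered as a heatmap.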