
Fix ViT-SO400M-14-SigLIP context length #1001

Closed

Conversation

@shkarupa-alex

According to the paper, all SigLIP models have a context length of 64.
@rwightman
Collaborator

[Screenshot 2024-11-29 at 12.22.31 PM.png — upload failed]

@rwightman
Collaborator

Hmm, the screenshot doesn't seem to work. @shkarupa-alex, I take this notebook to be the source of truth on these models, and it says 16: https://github.com/google-research/big_vision/blob/main/big_vision/configs/proj/image_text/SigLIP_demo.ipynb
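
For reference, a minimal sketch of how this can be checked locally, assuming the open_clip tokenizer pads its output to the model's configured context length:

```python
import open_clip

# The tokenizer pads/truncates to the context length in the model config,
# so the output shape reveals what the released config actually uses.
tokenizer = open_clip.get_tokenizer('ViT-SO400M-14-SigLIP')
tokens = tokenizer(["a photo of a cat"])
print(tokens.shape)  # expected (1, 16) if the config keeps context_length = 16
```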

@shkarupa-alex
Author

Yes, but...
The TF ViT uses valid padding in the stem: https://github.com/google-research/big_vision/blob/main/big_vision/models/vit.py#L212
This means the real image size it sees is floor(384 / 14) * 14 = 27 * 14 = 378.
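
A minimal sketch of that valid-padding behavior, assuming a stride-14, padding-0 conv stem (1152 is SO400M's embed dim):

```python
import torch
import torch.nn as nn

# Patch-embedding stem as in the TF ViT: 14x14 conv, stride 14,
# VALID padding (padding=0), so remainder pixels are silently dropped.
stem = nn.Conv2d(3, 1152, kernel_size=14, stride=14, padding=0)

x = torch.randn(1, 3, 384, 384)
patches = stem(x)
print(patches.shape)  # torch.Size([1, 1152, 27, 27]) -> only 27 * 14 = 378 px used
```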

To get the same logits as in pretraining, size 384 should be kept (the discrepancy can be seen as a crop augmentation). But at inference time we want the model to see the whole image, without "cropping".

@rwightman
Collaborator

@shkarupa-alex that's a completely separate issue, and it's why there are both 378x378 and 384x384 versions of those model configs with the same weights. There was a mistake in the original: 378 should have been used with the 14x14 patch size, but they chose 384. Using 378 in preprocessing yields better results, because a 384 image will be truncated to 378, but I left a 384 config so it matches the official implementation.
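
For illustration, the patch arithmetic behind the two configs (plain Python, nothing model-specific):

```python
patch = 14
for size in (378, 384):
    grid = size // patch              # patches per side under valid padding
    dropped = size - grid * patch     # pixels truncated per side
    print(f"{size} -> {grid}x{grid} patches, {dropped} px dropped per side")
# 378 -> 27x27 patches, 0 px dropped per side
# 384 -> 27x27 patches, 6 px dropped per side
```

Both sizes produce the same 27x27 token grid; resizing to 378 just ensures no input pixels are discarded.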

@rwightman closed this on Dec 2, 2024