fix coca_cfg #691

Closed
wants to merge 1 commit

Conversation

gpucce
Contributor

@gpucce gpucce commented Oct 22, 2023

No description provided.

@gpucce gpucce marked this pull request as draft October 22, 2023 22:45
@rwightman
Collaborator

@gpucce much clearer, haha, thanks.

So, I'm a bit confused by this. I get that the class token is being added in the model, but why would you want to pass a sequence of 77 tokens then, instead of 76?

context_length == the length of the sequence passed in the text input. So in essence, if you were passing 77 tokens and then adding a class token, it was ending up as 78, no?
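
To make the off-by-one concrete, here is a minimal, hypothetical sketch (not the actual open_clip code; the tower, widths, and names are invented for illustration): a text tower with context_length positional embeddings that appends a CLS embedding internally overflows by one position if it is fed context_length tokens.

    import torch
    import torch.nn as nn

    class TinyTextTower(nn.Module):
        def __init__(self, vocab_size=1000, width=64, context_length=77):
            super().__init__()
            self.context_length = context_length
            self.token_embedding = nn.Embedding(vocab_size, width)
            self.positional_embedding = nn.Parameter(torch.zeros(context_length, width))
            self.cls_emb = nn.Parameter(torch.zeros(1, 1, width))  # appended inside forward

        def forward(self, text):
            x = self.token_embedding(text)                 # [B, seq_len, width]
            cls = self.cls_emb.expand(x.shape[0], -1, -1)  # [B, 1, width]
            x = torch.cat([x, cls], dim=1)                 # [B, seq_len + 1, width]
            # Fails when seq_len + 1 > context_length: only 77 positions exist.
            return x + self.positional_embedding[: x.shape[1]]

    tower = TinyTextTower()
    print(tower(torch.zeros(2, 76, dtype=torch.long)).shape)  # 76 + 1 == 77 -> OK
    try:
        tower(torch.zeros(2, 77, dtype=torch.long))            # 77 + 1 == 78 -> overflow
    except RuntimeError as err:
        print("off by one:", err)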

@gpucce
Contributor Author

gpucce commented Oct 22, 2023

@rwightman

> @gpucce much clearer, haha, thanks.
>
> So, I'm a bit confused by this. I get that the class token is being added in the model, but why would you want to pass a sequence of 77 tokens then, instead of 76?
>
> context_length == the length of the sequence passed in the text input. So in essence, if you were passing 77 tokens and then adding a class token, it was ending up as 78, no?

Honestly, I am not 100% sure why I did it. I think it initially errored with more than 77 tokens; the issue was an inconsistency between the text encoder and the multimodal decoder that would error if the context length was longer than 77. It could be that it works now. I am rushing a bit because it is very late, but I think this keeps things the same as they have been until now, just with the right tokenizer returned by get_tokenizer, as you said.

Relatedly, I don't think I can continue with the regression tests now, and I will be very busy over the next few days. Can I ask how urgent it is to have them finished?

@rwightman
Collaborator

@gpucce generation seems to be working okay. There might be an off-by-one issue related to this, but if it's erring on the lower side (passing 76 tokens) it shouldn't be harmful, no? I can poke around a bit more and see if there is any reason to hold off the release, but either way, get some sleep, and thanks for spending the time that you have.

@gpucce
Contributor Author

gpucce commented Oct 22, 2023

> @gpucce generation seems to be working okay. There might be an off-by-one issue related to this, but if it's erring on the lower side (passing 76 tokens) it shouldn't be harmful, no? I can poke around a bit more and see if there is any reason to hold off the release, but either way, get some sleep, and thanks for spending the time that you have.

I think the model itself is in fact totally fine: the logits from the forward pass are the same as they used to be and as they have been across several older versions, and I have been testing the whole thing a bit today, so it should be fine. The issue I had was this off-by-one, but I am now very confident it is only in the tokenizer, not in the model.
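
For reference, a hedged sketch of the kind of check being described (not the project's actual regression test; the model name, pretrained tag, and reference file are assumed for illustration): run the same fixed inputs through the current code and compare the outputs against values saved from an older release.

    import torch
    import open_clip

    # Assumed model/pretrained names; real weights are needed for the comparison to mean anything.
    model, _, _ = open_clip.create_model_and_transforms(
        'coca_ViT-B-32', pretrained='laion2b_s13b_b90k')
    tokenizer = open_clip.get_tokenizer('coca_ViT-B-32')
    model.eval()

    text = tokenizer(["a photo of a cat"])
    image = torch.zeros(1, 3, 224, 224)  # fixed dummy image for reproducibility

    with torch.no_grad():
        text_feats = model.encode_text(text)
        image_feats = model.encode_image(image)

    # `coca_reference.pt` is a hypothetical file holding the same features
    # captured with the previous open_clip release.
    # ref = torch.load("coca_reference.pt")
    # assert torch.allclose(text_feats, ref["text"], atol=1e-5)
    # assert torch.allclose(image_feats, ref["image"], atol=1e-5)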

@rwightman
Collaborator

@gpucce yeah, so the tokenizer behaviour did change, but I think the issue is a bit of muddled logic w.r.t. seq len / context len handling in the model. The context_length attribute should reflect the length of the sequence that can be passed into the model inputs. You could pass a 77-length input to this model even though context_length was 76. Tokenizers are doing what they should now. But before, we always fed 77 tokens into the model even though the context length didn't match that, and the model logic was adapted to work around it...
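
As a hedged illustration of what "tokenizers doing what they should" means here (model name assumed for illustration): get_tokenizer should now produce tensors of the config's context_length, leaving room for the internally appended CLS embedding.

    import open_clip

    tokenizer = open_clip.get_tokenizer('coca_ViT-B-32')  # model name assumed
    tokens = tokenizer(["a photo of a cat", "a diagram"])
    # With context_length set to 76 in the text config, this should print
    # torch.Size([2, 76]); the model then appends its CLS embedding to reach 77.
    print(tokens.shape)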

@gpucce
Contributor Author

gpucce commented Oct 22, 2023

> @gpucce yeah, so the tokenizer behaviour did change, but I think the issue is a bit of muddled logic w.r.t. seq len / context len handling in the model. The context_length attribute should reflect the length of the sequence that can be passed into the model inputs. You could pass a 77-length input to this model even though context_length was 76. Tokenizers are doing what they should now. But before, we always fed 77 tokens into the model even though the context length didn't match that, and the model logic was adapted to work around it...

@rwightman I agree with all of it; my only comment was that I couldn't find any other issue with generation, so hopefully there aren't any.

@rwightman
Collaborator

@gpucce k, thanks. Yeah, it seems like generation is probably okay, so I'll investigate this issue a bit more to make sure it doesn't cause any significant problems.

@rwightman
Collaborator

rwightman commented Oct 23, 2023

k, so it strikes me that the embed_cls bool arg should be removed altogether? It's always chopping the last token off the input when True (which seems like it could cause other problems, e.g. removing the EOT token in some cases). It looks like it was added to work around the tokenizers having a constant 77-token output? With tokenizers obeying the context_length attr in the model, it seems we should leave that at 76, and we can safely add the cls embed without going over.


    def _encode_text(self, text, normalize: bool = True, embed_cls: bool = True):
        text = text[:, :-1] if embed_cls else text # make space for CLS token
        text_latent, token_emb = self.text(text)
        text_latent = F.normalize(text_latent, dim=-1) if normalize else text_latent
        return text_latent, token_emb

    def encode_image(self, images, normalize: bool = True):
        image_latent, _ = self._encode_image(images, normalize=normalize)
        return image_latent

    def encode_text(self, text, normalize: bool = True, embed_cls: bool = True):
        text_latent, _ = self._encode_text(text, normalize=normalize, embed_cls=embed_cls)
        return text_latent

@gpucce
Contributor Author

gpucce commented Oct 23, 2023

#692 addresses this in a better way

@gpucce gpucce closed this Oct 23, 2023