"ValueError: Attempting to unscale FP16 gradients" in vicuna_v1.1 #9

Zhudongsheng75 opened this issue Sep 29, 2023 · 2 comments

Zhudongsheng75 commented Sep 29, 2023

I have a question I would like to share with the authors. I would be very grateful if you could reply.

As far as I understand, your work follows InstructBLIP. However, the original InstructBLIP paper uses vicuna_v1.1 as the LLM weights, whereas this repository uses v0.1. Why did you choose different LLM weights?

In fact, I tried training with vicuna_v1.1, but I encountered the error in the title, "ValueError: Attempting to unscale FP16 gradients". After tracing the error, I found that it may be caused by the following code in BLIVA/bliva/models/blip2_vicuna_instruct.py:

self.llm_tokenizer.add_special_tokens({'pad_token': '[PAD]'})
self.llm_tokenizer.add_special_tokens({'bos_token': '</s>'})
self.llm_tokenizer.add_special_tokens({'eos_token': '</s>'})
self.llm_tokenizer.add_special_tokens({'unk_token': '</s>'})

self.llm_model.resize_token_embeddings(len(self.llm_tokenizer))

Did you encounter a similar problem and therefore replace v1.1 with v0.1?
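
For context, PyTorch's GradScaler raises "Attempting to unscale FP16 gradients" when it is asked to unscale a gradient stored in float16, which happens when a float16 parameter is left trainable. Whether the resized embedding is that parameter here is an assumption, not something confirmed in this thread. A minimal diagnostic sketch follows; the helper name find_trainable_fp16 and the demo parameter names are hypothetical, and stand-in objects replace real tensors so the snippet runs without PyTorch:

```python
from types import SimpleNamespace


def find_trainable_fp16(named_params):
    """Return names of parameters that are float16 and still require gradients.

    `named_params` is any iterable of (name, param) pairs, for example the
    result of `model.named_parameters()` on a PyTorch module.
    """
    return [
        name
        for name, param in named_params
        # str(torch.float16) is "torch.float16", so this works on real tensors too
        if param.requires_grad and str(param.dtype) == "torch.float16"
    ]


# Stand-in parameters with hypothetical names (assumption: not taken from BLIVA).
demo_params = [
    ("llm_model.embed_tokens.weight",
     SimpleNamespace(dtype="torch.float16", requires_grad=True)),
    ("llm_model.layers.0.weight",
     SimpleNamespace(dtype="torch.float16", requires_grad=False)),
    ("llm_proj.weight",
     SimpleNamespace(dtype="torch.float32", requires_grad=True)),
]

offenders = find_trainable_fp16(demo_params)
print(offenders)
```

With a real model, the same helper could be called on llm_model.named_parameters() just before training starts. Any names it returns are candidates for the error: they would need to be frozen (requires_grad = False) or kept in float32.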

@gordonhu608
Collaborator

Thank you for your interest in our work. Could you please also try training with version 0.1 under the same settings to verify that this is the problem? v1.1 and v0.1 differ only in tokenization and separator.

@Zhudongsheng75
Author

Thank you for your reply. Is the difference between v0.1 and v1.1 really limited to tokenization and separator? I tried generating with v0.1 and v1.1 and got completely different results. Using a mismatched Vicuna version for generation produces confusing output.
