fsdp Error report #232
Comments
Ah, sorry about that! The issue is from this line of the FSDP wrapping function. The MPT models are still missing some standard HF Transformers functions. Would it work for your use case to comment out the aforementioned line from our codebase? The output embedding weight will then not be sharded. Alternatively, we can add a hack to get around this, similar to this part of the code for MPT-1B.
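For anyone who wants to try the second option locally, here is a rough sketch of what such a hack could look like. It is not the project's actual code: the helper name `add_missing_embedding_accessors` is made up for illustration, and the default attribute path `transformer.wte` is only an assumption about where the model keeps its embedding module, so check your model before relying on it. The idea is simply to mix `get_output_embeddings` / `set_output_embeddings` into the instance when the class does not define them.

```python
def add_missing_embedding_accessors(model, attr_path="transformer.wte"):
    """Mix get/set_output_embeddings into `model` if either one is missing.

    `attr_path` is a dotted path to the embedding module; the default is an
    assumption and may not match your model.
    """
    parent_path, _, leaf = attr_path.rpartition(".")

    def _parent(obj):
        # Walk every component of the path except the last one.
        for name in filter(None, parent_path.split(".")):
            obj = getattr(obj, name)
        return obj

    class _EmbeddingAccessors:
        def get_output_embeddings(self):
            return getattr(_parent(self), leaf)

        def set_output_embeddings(self, new_embeddings):
            setattr(_parent(self), leaf, new_embeddings)

    wanted = ("get_output_embeddings", "set_output_embeddings")
    if not all(hasattr(model, name) for name in wanted):
        base = model.__class__
        # Swap in a subclass that lists the mixin first, so both accessors
        # come from the mixin even if the base class defines one of them.
        model.__class__ = type(base.__name__, (_EmbeddingAccessors, base), {})
    return model
```

Note that this only helps when the accessors are absent altogether. If the model already defines them but they return something unexpected, the check above leaves the model unchanged and a different fix is needed.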
Thanks for your quick reply!
It is very strange.
But it still reports the above mismatch error.
Have you solved the issue? I have the same problem when training with FSDP.
Thanks for this wonderful project. I used the following script to train the model.
However, if I set the fsdp flag, it will report an error as follows:
location: flamingo.py line 294.
If I remove this flag, there is no error. Do you have any idea about this?