Error in the DDP mode #3
When training the model in DDP mode, it raises the error "This error indicates that your module has parameters that were not used in producing loss".
Since the MiniLM objective uses only the attention parameters, student-model parameters such as "bert.encoder.layer.-1.output.xx" and "cls.predictions.transform.xx" receive no gradient updates. How can this be fixed? Thanks.
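For reference, two common workarounds for this DDP error are (a) passing `find_unused_parameters=True` when constructing `DistributedDataParallel`, or (b) setting `requires_grad = False` on the parameters that never contribute to the loss before wrapping the model. The sketch below is a minimal illustration of both, not this repository's code: the helper names `freeze_unused_parameters` and `wrap_ddp` and the regex patterns are hypothetical, and `wrap_ddp` assumes the process group is already initialized and that `local_rank` is a valid CUDA device index.

```python
import re


def freeze_unused_parameters(model, unused_patterns):
    """Disable gradients for parameters whose names match any regex pattern.

    `model` only needs a `named_parameters()` method, as torch.nn.Module has.
    Returns the names of the parameters that were frozen.
    """
    compiled = [re.compile(p) for p in unused_patterns]
    frozen = []
    for name, param in model.named_parameters():
        if any(rx.search(name) for rx in compiled):
            # DDP only tracks parameters that require gradients, so frozen
            # parameters no longer trigger the "not used in producing loss" error.
            param.requires_grad = False
            frozen.append(name)
    return frozen


def wrap_ddp(model, local_rank, find_unused=False):
    """Wrap `model` in DistributedDataParallel (hypothetical helper).

    Assumes torch.distributed.init_process_group() has already been called.
    Set find_unused=True to let DDP detect unused parameters on every
    iteration instead of freezing them up front (this adds some overhead).
    """
    # Imported lazily so the freezing helper can be used without torch.
    from torch.nn.parallel import DistributedDataParallel as DDP

    return DDP(
        model.to(local_rank),
        device_ids=[local_rank],
        find_unused_parameters=find_unused,
    )
```

Usage would look roughly like `freeze_unused_parameters(student, [r"^cls\.predictions\.transform\."])` followed by `wrap_ddp(student, local_rank)`, with the patterns adjusted to match whichever student parameters the loss does not touch. Freezing must happen before the model is wrapped; otherwise `wrap_ddp(student, local_rank, find_unused=True)` is the simpler option.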