maybe bugs in loss backward? #15
Comments
This also confuses me, and a related issue: self.forward(x, optimizer_idx=1) will call the encoder and decoder again on the same x, so why not share the result with self.forward(x, optimizer_idx=0)?
@shinshiner Hi, I found that the usual method in similar frameworks is:
I think this way of writing is not fundamentally different from the author's practice. The order of operations in PyTorch Lightning is: loss.backward(), opt.step(), opt.zero_grad().
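For reference, here is a minimal plain-PyTorch sketch of the update order described in the comment above (backward, then step, then zero_grad), using hypothetical generator/discriminator modules rather than the repo's actual classes. With this ordering, any gradients that the generator's backward pass deposits on the discriminator's parameters are still sitting in its .grad buffers when the discriminator's optimizer finally steps.

```python
import torch

# Hypothetical stand-ins for the repo's generator and discriminator.
gen = torch.nn.Linear(8, 8)
disc = torch.nn.Linear(8, 1)
opt_g = torch.optim.Adam(gen.parameters())
opt_d = torch.optim.Adam(disc.parameters())

x = torch.randn(4, 8)

# Generator step: backward -> step -> zero_grad (the order described above).
g_loss = -disc(gen(x)).mean()   # adversarial term for the generator
g_loss.backward()               # NOTE: this also populates disc's .grad
opt_g.step()
opt_g.zero_grad()               # only clears gen's gradients, not disc's

# Discriminator step: its .grad buffers still hold the gradients leaked
# from g_loss.backward(), so opt_d.step() applies d_loss grads plus the leak.
d_loss = disc(gen(x).detach()).mean() - disc(x).mean()
d_loss.backward()
opt_d.step()
opt_d.zero_grad()
```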
I also noticed this in the repo and agree with you. I think we should set requires_grad = False on the discriminator when computing the generator loss.
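A minimal sketch of that suggestion, with a hypothetical set_requires_grad helper and stand-in modules (not the repo's actual code): freeze the discriminator's parameters while the generator loss is backpropagated, then unfreeze them for the discriminator's own step.

```python
import torch
from torch import nn

def set_requires_grad(module: nn.Module, flag: bool) -> None:
    """Toggle requires_grad on every parameter of a module."""
    for p in module.parameters():
        p.requires_grad_(flag)

# Hypothetical stand-ins for the discriminator and a generated (fake) sample.
discriminator = nn.Linear(8, 1)
fake_sample = torch.randn(4, 8, requires_grad=True)  # pretend generator output

# Freeze the discriminator so g_loss.backward() cannot write gradients
# into its weights; gradients still flow back to the generator side.
set_requires_grad(discriminator, False)
g_loss = -discriminator(fake_sample).mean()
g_loss.backward()
set_requires_grad(discriminator, True)

print(discriminator.weight.grad)  # None: nothing accumulated on the discriminator
print(fake_sample.grad is None)   # False: the generator side still receives gradients
```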
Hello, thanks for your great work. While exploring the code, I found something confusing and want to confirm whether it is a bug.
In https://github.com/FoundationVision/OmniTokenizer/blob/main/OmniTokenizer/omnitokenizer.py#L548, you generate a fake sample and pass it through the generator and the discriminator to compute the generator's adversarial loss. When this loss is backpropagated, the discriminator's weights also accumulate gradients from the generator loss.
However, since you use gradient accumulation, the optimizer does not zero the gradients before backward, so the discriminator ends up being updated with gradients that came from the generator loss!
Is this a bug, or is it written this way because of some underlying mathematics?
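A small, self-contained way to check the behavior described above, using hypothetical modules rather than OmniTokenizer's code: after backpropagating only the generator's adversarial loss, inspect whether the discriminator's parameters already carry gradients.

```python
import torch
from torch import nn

generator = nn.Linear(8, 8)       # stand-in for the tokenizer's decoder/generator
discriminator = nn.Linear(8, 1)   # stand-in for the adversarial discriminator

x = torch.randn(4, 8)
fake = generator(x)                      # fake sample produced by the generator
g_loss = -discriminator(fake).mean()     # generator's adversarial loss
g_loss.backward()

# Without freezing the discriminator (or cutting the graph on its side), its
# weights now hold gradients that came purely from the generator's loss:
print(discriminator.weight.grad)  # nonzero tensor -> leaked gradients
print(generator.weight.grad)      # generator gradients, as intended
```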