This repository has been archived by the owner on Apr 4, 2024. It is now read-only.

Solution to loss explosion #13

Open
fkeufss opened this issue Jul 30, 2022 · 3 comments
Comments

@fkeufss

fkeufss commented Jul 30, 2022

Thank you for sharing your code. While running it, I also encountered the loss explosion problem. Do you know the underlying reason for it? Is there a better solution than manually restarting training with a lower learning rate each time?

@TomTomTommi
Owner

Hi, thanks for your interest. This problem does occur frequently and deserves further study, but I have not yet analyzed its root cause.
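The repository does not appear to automate this workaround, but the restart-with-lower-learning-rate procedure described above can be wrapped in a loop. Below is a framework-agnostic sketch, not the author's method; `step_fn`, `explode_factor`, and `lr_decay` are hypothetical names, and in a real PyTorch run the rollback would reload a saved checkpoint rather than deep-copy parameters in memory.

```python
import copy

def train_with_rollback(step_fn, params, lr, max_steps,
                        explode_factor=10.0, lr_decay=0.5):
    """Rerun from the last known-good state with a smaller lr on loss spikes.

    step_fn(params, lr) performs one training step in place and returns
    the loss (hypothetical interface, for illustration only).
    """
    best = copy.deepcopy(params)   # last known-good parameters
    prev_loss = None
    step = 0
    while step < max_steps:
        loss = step_fn(params, lr)
        # Treat a sudden jump in the loss as an explosion:
        # roll back to the last good state and retry with a smaller lr.
        if prev_loss is not None and loss > explode_factor * prev_loss:
            params.clear()
            params.update(copy.deepcopy(best))
            lr *= lr_decay
            continue
        best = copy.deepcopy(params)
        prev_loss = loss
        step += 1
    return params, lr
```

Gradient clipping (e.g. PyTorch's `torch.nn.utils.clip_grad_norm_`) is another common mitigation worth trying before resorting to restarts.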

@Hatermelon

> Thank you for sharing your code. I am trying your code and I do find the loss explosion problem. Do you know the inherent reason of it? Is there any better solution instead of restarting training with lower learning rate every time manually?

Hello, were you able to continue training normally after modifying the parameters manually? This is my first time using the manual method to work around the loss explosion problem. After I changed the learning rate and other parameters as described, training restarted from the first epoch instead of continuing toward 500 epochs, and the learning rate did not reflect my changes. Is there something I have overlooked? Thank you.
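I have not seen your resume script, but a common cause of both symptoms is the order of operations when resuming: loading the optimizer's saved state overwrites a learning rate that was set beforehand, and if the saved epoch counter is not restored, the loop restarts from epoch 1. A minimal sketch of the fix, using a hypothetical `TinyOptimizer` that mimics PyTorch's `param_groups` convention:

```python
class TinyOptimizer:
    """Minimal stand-in for a PyTorch-style optimizer (illustrative only)."""
    def __init__(self, lr):
        self.param_groups = [{"lr": lr}]

    def state_dict(self):
        return {"param_groups": [dict(g) for g in self.param_groups]}

    def load_state_dict(self, state):
        self.param_groups = [dict(g) for g in state["param_groups"]]

# A checkpoint saved before the explosion, at the old learning rate.
opt = TinyOptimizer(lr=1e-3)
ckpt = {"optimizer": opt.state_dict(), "epoch": 120}

# Resume: constructing the optimizer with the new lr is NOT enough,
# because load_state_dict restores the old lr from the checkpoint...
new_lr = 5e-4
opt = TinyOptimizer(lr=new_lr)
opt.load_state_dict(ckpt["optimizer"])

# ...so the new lr must be re-applied AFTER loading, and the saved
# epoch restored so training does not restart from epoch 1.
for group in opt.param_groups:
    group["lr"] = new_lr
start_epoch = ckpt["epoch"] + 1
```

The same pattern applies to a real `torch.optim` optimizer: call `load_state_dict` first, then write the new value into each entry of `optimizer.param_groups`, and begin the epoch loop at the restored count.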

@lyq2335458686

Hello, when I run your code the GPU is not used, even though I have definitely installed CUDA. Why might that be?
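Installing the CUDA toolkit alone is not sufficient: the installed PyTorch wheel must itself be built with CUDA support, and `torch.cuda.is_available()` reports whether the runtime can actually see a GPU. A small diagnostic helper (`cuda_status` is a hypothetical name) that degrades gracefully when `torch` is absent:

```python
import importlib.util

def cuda_status():
    """Report why the GPU may not be usable (illustrative helper)."""
    if importlib.util.find_spec("torch") is None:
        return "torch is not installed in this environment"
    import torch
    if not torch.cuda.is_available():
        # Common cause: a CPU-only torch wheel was installed even though
        # the CUDA toolkit is present on the system. Reinstalling torch
        # with a CUDA-enabled build usually fixes this.
        return "torch cannot see a GPU (CPU-only build or driver issue)"
    return "CUDA ok: " + torch.cuda.get_device_name(0)
```

Checking `torch.version.cuda` (it is `None` for CPU-only builds) and running `nvidia-smi` are quick ways to tell which of these cases applies.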
