Hello!
I've run into a bug that is hard to solve.
I've made many modifications to your proposed code, and everything was fine.
Last week I modified the original code for a new dataset and ran the proposed code as a baseline. The original code worked fine, but the modified code hit this bug.
The dataset is not corrupted, and no matter how I check the code and dataset, the loss becomes NaN before iteration 10000. The strange thing is that when I re-run the original code now, the same bug happens, yet last week's run of the original code trained without any problem.
Could you try running the original code? I don't know why the loss is NaN. Can you help me solve this bug? It is driving me crazy.
This is my config.yaml, which is almost identical to yours.
It seems that there is a bug in your data or your custom dataset. You may try to assert there is no NaN or Inf in your data. I don't know if "the dataset is not corrupted" you mentioned refers to that. You may debug the code to find where the NaN or Inf first appears.
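In case it helps, here is a minimal sketch of the kind of check being suggested, assuming a PyTorch training setup with a standard DataLoader (the names `dataloader`, `inputs`, and `targets` are placeholders for whatever your loop actually uses):

```python
import torch

def assert_finite(tensor, name):
    # Fail fast with a descriptive message if the tensor contains NaN or Inf.
    if not torch.isfinite(tensor).all():
        raise ValueError(f"Non-finite values found in {name}")

# Check every batch before it reaches the model.
for i, (inputs, targets) in enumerate(dataloader):
    assert_finite(inputs, f"inputs (batch {i})")
    assert_finite(targets, f"targets (batch {i})")

# Optionally, ask autograd to report the first op that produces NaN/Inf
# during the backward pass, which helps locate where it first appears.
torch.autograd.set_detect_anomaly(True)
```

If the data passes these checks, the anomaly detection mode will usually point at the operation where the NaN is first produced (e.g. a log of zero or a division by zero in the loss).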
Thanks for your reply. The bug is strange. My dataset has no issues, and after running the model again and again, the bug no longer appears.