Cannot reproduce model results #20
Comments
I'm not sure why this happened; it has never happened for me. This was using weights that you trained, correct, and not the downloaded weights? Maybe it was just a particularly bad initialization and the training got stuck. Have you tried again and run into the same issue? Have you tried with cuDNN? |
I wasn't able to get cuDNN working. I get the message:
I used the weights that I trained. If I use the downloaded weights, then it works! :) But really I would like to train it myself (I intend to experiment with some minor variants). I will try retraining, but it will take my GPU ~5 days before I can reply with the results. |
Hmmm yeah not sure about the cuDNN issue, but if you can get that working on your machine, it should speed up training a lot. If you end up trying to train again, I would run |
I tried training again. I think it was almost a day or two faster this time, for some reason. Adding
The Model MSE was slightly better:
But something is still clearly wrong: |
Here is the full output:
It seems to dramatically diverge after 50 epochs? |
That's really weird. I'm not sure what's going on and have never seen that. The loss should drop a lot quicker than that, and the divergence is also weird. I would start with just one training example and make sure you can train to essentially zero error on it. That should happen quickly, so you can easily experiment. |
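The single-example check suggested above can be sketched in a few lines. This is only a toy illustration, not PredNet: it assumes a tiny linear model and plain gradient descent, and the function name `train_on_single_example` and its parameters are made up for the example. The point is the pattern: if the loss on one example does not go to roughly zero, the problem is in the model or loss setup rather than in the data or the training length.

```python
import random


def train_on_single_example(x, y, lr=0.05, steps=500, seed=0):
    """Fit a toy linear model y_hat = w.x + b to ONE example (hypothetical helper).

    Returns the squared error recorded at every step, so the caller can
    check that it falls to (almost) zero.
    """
    rng = random.Random(seed)
    w = [rng.uniform(-0.1, 0.1) for _ in x]  # small random initial weights
    b = 0.0
    losses = []
    for _ in range(steps):
        y_hat = sum(wi * xi for wi, xi in zip(w, x)) + b
        err = y_hat - y
        losses.append(err * err)
        # Gradient of the squared error: 2*err*x_i for w_i, 2*err for b.
        w = [wi - lr * 2 * err * xi for wi, xi in zip(w, x)]
        b -= lr * 2 * err
    return losses


losses = train_on_single_example([0.5, -1.0, 2.0], 3.0)
print("first loss: %.4f, last loss: %.2e" % (losses[0], losses[-1]))
```

With a real model the same idea applies: restrict the training set to one sequence, train for a short while, and confirm the loss collapses before launching a multi-day run.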
@Faur |
Sorry I don't remember, and I don't have access to the same setup that I had back then |
Changing the seed during training changes the color of the output as well as the MSE. I tried a bunch of seeds, yet I cannot achieve a balanced image. |
@nistha21 It seems like there is an issue with TimeDistributed in Keras 2, where it overrides the initial weights of the layer to be wrapped (keras-team/keras#8895). In our case, this results in a meaningless loss function. I adjusted the code for this (9f6482e). Give this a shot and let me know if you still have issues - thanks! |
I wasn't getting the strange colourisation of the output, but I was getting poorer-quality output before the latest commit (9f6482e). Compare the results from before and after. I just wanted to say thanks to bill-lotter for fixing the issue, and to encourage anyone with issues to try again with the latest commit. |
Hello, I recently noticed the commit for the TimeDistributed Keras 2.0 syntax fix. Before the fix, my loss plots looked reasonable on my data. However, after applying the fix, my loss plots behave strangely: they jump significantly partway through training and do not reach their lowest loss at the end of training. The loss is also higher than it was before the fix. Have you seen anything of this sort? Do you think that this behavior could be circumvented simply by longer training? Thanks! |
After running all of the steps in the README.md verbatim, the prediction_scores.txt contains:
and all of the generated plots contain only two colours (pink and blue); for example, here are the first five:
I am using Python 2.7 and Theano==0.9.0 (GPU with pygpu==0.6.2) and Keras==1.0.8 (it seems higher versions are incompatible [see #18]). I am not using cuDNN.