Training speed #18
Comments
In addition, one node always seems to be idle during the forward pass.
Hi, if I remember correctly, this code ran a bit faster for us, at around one hour per epoch.
Thanks for your reply. It is indeed the IO that slows down the training. Currently, I have to store the data in the cloud and cannot use a local SSD. Do you have any suggestions for making training feasible on a cloud computing platform such as Azure? Also, I wonder whether the center crop option reduces performance?
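One way to confirm that IO rather than compute is the bottleneck is to time how long the training loop waits for the next batch versus how long it spends in the training step. The following is a minimal sketch, not code from this repository: `measure_loader_share`, `slow_batches`, and `step_fn` are hypothetical names, and the simulated loader merely stands in for reads from remote storage.

```python
import time


def measure_loader_share(batches, step_fn, max_steps=50):
    """Time waiting-for-data vs. compute over up to max_steps batches.

    Returns (load_seconds, compute_seconds, load_share), where load_share
    is the fraction of measured time spent waiting on the iterator.
    Note (assumption): on a GPU, step_fn should block until the step has
    finished (e.g. by synchronizing the device), otherwise the compute
    time is under-reported.
    """
    load_s = 0.0
    compute_s = 0.0
    it = iter(batches)
    for _ in range(max_steps):
        t0 = time.perf_counter()
        try:
            batch = next(it)  # time spent here is data loading / IO
        except StopIteration:
            break
        t1 = time.perf_counter()
        step_fn(batch)  # time spent here is the training step
        t2 = time.perf_counter()
        load_s += t1 - t0
        compute_s += t2 - t1
    total = load_s + compute_s
    share = load_s / total if total > 0 else 0.0
    return load_s, compute_s, share


def slow_batches(n, delay):
    """Hypothetical loader that simulates slow storage with a sleep."""
    for i in range(n):
        time.sleep(delay)
        yield i


if __name__ == "__main__":
    load_s, compute_s, share = measure_loader_share(
        slow_batches(5, 0.02), lambda batch: None
    )
    print(f"load {load_s:.3f}s, compute {compute_s:.3f}s, share {share:.0%}")
```

If the reported share is close to 1, the GPUs are mostly waiting for data, and options such as more loader workers, caching a local copy, or packing the dataset into larger sequential files are worth trying before tuning the model code.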
Thanks for this nice work. Could you provide a rough estimate of the running time for this implementation?
Currently, it takes around 2.5 hours to train one epoch, which seems much slower than normal. (Total batch size 2048, 4 x 8 V100, 32 GB)
Thank you!