Loss not decreasing on default config settings #43

Open
codefreakSubham opened this issue Oct 7, 2022 · 0 comments

@rowanz
Hi, I am trying to train the model from scratch with the default config settings, but I cannot reproduce the reported results. Specifically, the loss does not decrease from epoch to epoch: after epoch 0 the training loss stays at about 1.386 and the validation accuracy stays at 0.249. I ran training for 20 epochs, and the log is below. Has anyone faced this issue, or does anyone know a possible reason for it? Any suggestions would be a great help. Thank you.

TRAIN EPOCH 0:
loss 1.356284
crl 0.144345
accuracy 0.311996
sec_per_batch 1.702358
hr_per_epoch 1.048369
dtype: float64

Val epoch 0 has acc 0.249 and loss 1.386
Best validation performance so far. Copying weights to 'saves/flagship_rationale/best.th'.

TRAIN EPOCH 1:
loss 1.386393
crl 0.089470
accuracy 0.249471
sec_per_batch 2.008696
hr_per_epoch 1.237022
dtype: float64

Val epoch 1 has acc 0.249 and loss 1.386

TRAIN EPOCH 2:
loss 1.386381
crl 0.075422
accuracy 0.251220
sec_per_batch 1.946174
hr_per_epoch 1.198519
dtype: float64

Epoch 2: reducing learning rate of group 0 to 1.0000e-04.
Val epoch 2 has acc 0.249 and loss 1.386

TRAIN EPOCH 3:
loss 1.386379
crl 0.050537
accuracy 0.248640
sec_per_batch 1.870728
hr_per_epoch 1.152057
dtype: float64

Val epoch 3 has acc 0.249 and loss 1.386

TRAIN EPOCH 4:
loss 1.386330
crl 0.042339
accuracy 0.250779
sec_per_batch 2.006369
hr_per_epoch 1.235589
dtype: float64

Val epoch 4 has acc 0.249 and loss 1.386

TRAIN EPOCH 5:
loss 1.386332
crl 0.037035
accuracy 0.250581
sec_per_batch 1.735174
hr_per_epoch 1.068578
dtype: float64

Val epoch 5 has acc 0.249 and loss 1.386

TRAIN EPOCH 6:
loss 1.386333
crl 0.032566
accuracy 0.249394
sec_per_batch 2.384569
hr_per_epoch 1.468497
dtype: float64

Epoch 6: reducing learning rate of group 0 to 5.0000e-05.
Val epoch 6 has acc 0.249 and loss 1.386

TRAIN EPOCH 7:
loss 1.386345
crl 0.020694
accuracy 0.247829
sec_per_batch 2.088539
hr_per_epoch 1.286192
dtype: float64

Val epoch 7 has acc 0.249 and loss 1.386

TRAIN EPOCH 8:
loss 1.386309
crl 0.017643
accuracy 0.251004
sec_per_batch 1.965981
hr_per_epoch 1.210717
dtype: float64

Val epoch 8 has acc 0.249 and loss 1.386

TRAIN EPOCH 9:
loss 1.386299
crl 0.015537
accuracy 0.251415
sec_per_batch 1.872479
hr_per_epoch 1.153135
dtype: float64

Val epoch 9 has acc 0.249 and loss 1.386

TRAIN EPOCH 10:
loss 1.386302
crl 0.014494
accuracy 0.251420
sec_per_batch 1.644809
hr_per_epoch 1.012928
dtype: float64

Epoch 10: reducing learning rate of group 0 to 2.5000e-05.
Val epoch 10 has acc 0.249 and loss 1.386

TRAIN EPOCH 11:
loss 1.386306
crl 0.009551
accuracy 0.252025
sec_per_batch 1.408009
hr_per_epoch 0.867099
dtype: float64

Val epoch 11 has acc 0.249 and loss 1.386

TRAIN EPOCH 12:
loss 1.386314
crl 0.007876
accuracy 0.250382
sec_per_batch 1.419217
hr_per_epoch 0.874001
dtype: float64

Val epoch 12 has acc 0.249 and loss 1.386

TRAIN EPOCH 13:
loss 1.386337
crl 0.007333
accuracy 0.248957
sec_per_batch 1.800047
hr_per_epoch 1.108529
dtype: float64

Val epoch 13 has acc 0.249 and loss 1.386

TRAIN EPOCH 14:
loss 1.386308
crl 0.006972
accuracy 0.251202
sec_per_batch 1.691500
hr_per_epoch 1.041682
dtype: float64

Epoch 14: reducing learning rate of group 0 to 1.2500e-05.
Val epoch 14 has acc 0.249 and loss 1.386

TRAIN EPOCH 15:
loss 1.386294
crl 0.004941
accuracy 0.250033
sec_per_batch 1.976553
hr_per_epoch 1.217227
dtype: float64

Val epoch 15 has acc 0.249 and loss 1.386

TRAIN EPOCH 16:
loss 1.386299
crl 0.004361
accuracy 0.250594
sec_per_batch 2.385966
hr_per_epoch 1.469357
dtype: float64

Val epoch 16 has acc 0.249 and loss 1.386

TRAIN EPOCH 17:
loss 1.386329
crl 0.004206
accuracy 0.249658
sec_per_batch 2.463118
hr_per_epoch 1.516870
dtype: float64

Val epoch 17 has acc 0.249 and loss 1.386

TRAIN EPOCH 18:
loss 1.386311
crl 0.003819
accuracy 0.249090
sec_per_batch 2.041939
hr_per_epoch 1.257494
dtype: float64

Epoch 18: reducing learning rate of group 0 to 6.2500e-06.
Val epoch 18 has acc 0.249 and loss 1.386

TRAIN EPOCH 19:
loss 1.386334
crl 0.003092
accuracy 0.249248
sec_per_batch 1.784414
hr_per_epoch 1.098902
dtype: float64

Val epoch 19 has acc 0.249 and loss 1.386
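
For reference, the plateau value in the log looks like chance-level performance: 1.386 is very close to ln 4 ≈ 1.3863, the cross-entropy of a uniform guess over four options, and an accuracy of 0.249 is about 1/4. Below is a minimal sketch of that check. It assumes the task is a four-way classification trained with a standard cross-entropy loss in nats; neither is stated in the log, and the names `NUM_CHOICES` and `chance_level_cross_entropy` are only illustrative.

```python
import math


def chance_level_cross_entropy(num_classes: int) -> float:
    """Cross-entropy (in nats) of a uniform prediction over num_classes."""
    return -math.log(1.0 / num_classes)


# Assumption: four answer choices per example (suggested by accuracy ~0.25).
NUM_CHOICES = 4

# Validation loss as reported in the log above (rounded to three decimals).
observed_loss = 1.386

chance_loss = chance_level_cross_entropy(NUM_CHOICES)
print(f"chance-level loss: {chance_loss:.6f}")

# Compare with a tolerance that covers the rounding in the logged value.
print("matches chance:", math.isclose(observed_loss, chance_loss, abs_tol=1e-3))
```

If the assumption holds, the model is outputting a near-uniform distribution for every example from epoch 1 onward, which would suggest it is not learning anything from the inputs rather than learning slowly.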
