Recently there have been papers on policy collapse and loss of plasticity in Reinforcement Learning suggesting that the default values for the Adam betas in PyTorch (b1=0.9, b2=0.999) are not ideal and are pretty much arbitrary. I noticed that the same defaults are used here.
This paper suggests using b1=b2 for better results.
This is more of a discussion than an issue, though. My testing seems to agree with the paper (for reference, I used b1=b2=0.9), both on my own env and on gym envs like the CartPole problem. However, I do not know how relevant this is outside of RL.
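For anyone who wants to see where the two betas enter the update, here is a minimal sketch of a textbook Adam step in plain Python. It is not Arraymancer's or PyTorch's implementation; `adam_step` and `minimize_quadratic` are hypothetical names made up for illustration, and the toy quadratic is only a stand-in objective, not an RL benchmark.

```python
import math


def adam_step(param, grad, m, v, t, lr=0.01, beta1=0.9, beta2=0.9, eps=1e-8):
    """One textbook Adam update for a scalar parameter (illustrative helper).

    beta1 controls the running mean of the gradient (first moment),
    beta2 controls the running mean of the squared gradient (second moment).
    Returns the updated (param, m, v).
    """
    m = beta1 * m + (1.0 - beta1) * grad
    v = beta2 * v + (1.0 - beta2) * grad * grad
    # Bias correction; t is the 1-based step count.
    m_hat = m / (1.0 - beta1 ** t)
    v_hat = v / (1.0 - beta2 ** t)
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v


def minimize_quadratic(beta1, beta2, steps=2000, lr=0.01):
    """Minimize the toy objective f(x) = (x - 3)^2 starting from x = 0."""
    x, m, v = 0.0, 0.0, 0.0
    for t in range(1, steps + 1):
        grad = 2.0 * (x - 3.0)  # derivative of (x - 3)^2
        x, m, v = adam_step(x, grad, m, v, t, lr=lr, beta1=beta1, beta2=beta2)
    return x


if __name__ == "__main__":
    # Compare the common defaults against the b1 = b2 setting discussed here.
    for b1, b2 in [(0.9, 0.999), (0.9, 0.9)]:
        print(f"b1={b1}, b2={b2}: x = {minimize_quadratic(b1, b2):.4f}")
```

In PyTorch itself the equivalent change is just passing `betas=(0.9, 0.9)` to `torch.optim.Adam`. A toy problem like this only shows the mechanics; it says nothing about whether b1=b2 helps on a real task.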
Mmmh, changing the default might break people's convergence. I remember struggling a lot with the Shakespeare RNN because I used a random tensor initialized from a normal distribution with 0.10 instead of 0.50, or something in that vein.
At the very least, we could improve the documentation and mention the paper.
Now, given that the NN part of Arraymancer has stagnated for ~5 years, maybe it's OK to change it.