Training methods for rpuconfig #600
Hello! I am trying to train the BERT model using different weight noise modifiers. In example 24, ADD_NORMAL is used with a std_dev specified in wandb. I tried swapping the modifier type to ADD_POLY and passing a list of coeffs, but it doesn't seem to work: no matter which coefficients I use, the training losses are pretty much the same. Please help! Thank you! These are the changes I made.
Replies: 1 comment 2 replies
I believe you have to set std_dev as well when using the coefficients, since it acts as an additional scale.
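To illustrate why the coefficients alone might have no visible effect, here is a minimal pure-Python sketch. It is not the toolkit's actual code: the noise formula (std_dev multiplied by a polynomial in |w| / w_max), the helper names `poly_noise_std` and `perturb`, and the `w_max` parameter are all assumptions made for illustration. Under that assumption, a std_dev of zero yields zero noise for any coeffs, which would match training losses that do not change.

```python
import random


def poly_noise_std(weight, coeffs, std_dev, w_max=1.0):
    """Assumed noise model: std_dev * sum_k coeffs[k] * (|w| / w_max)**k."""
    ratio = abs(weight) / w_max
    poly = sum(c * ratio**k for k, c in enumerate(coeffs))
    # std_dev multiplies the whole polynomial, so it acts as a global scale.
    return std_dev * poly


def perturb(weights, coeffs, std_dev, w_max=1.0, seed=0):
    """Add zero-mean Gaussian noise to each weight using the assumed model."""
    rng = random.Random(seed)
    return [
        w + rng.gauss(0.0, poly_noise_std(w, coeffs, std_dev, w_max))
        for w in weights
    ]


weights = [0.5, -0.25, 1.0]

# With std_dev = 0 the weights are unchanged, whatever the coefficients are.
print(perturb(weights, [1.0, 2.0, 3.0], std_dev=0.0))
print(perturb(weights, [9.0, 0.0, 0.1], std_dev=0.0))

# With a non-zero std_dev the coefficients start to matter.
print(perturb(weights, [1.0, 2.0, 3.0], std_dev=0.1))
```

If the real modifier behaves like this sketch, setting a non-zero std_dev alongside coeffs in the rpu_config should make the different coefficient lists produce different losses.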