This is a question about configuring the FP8 amax_history setting. I checked the documentation for this parameter; it is clear how to set it, and I understand the default value is 1024. However, there is little material on how tuning this value affects training results (loss and downstream benchmark performance). It is also unclear how to choose the amax computation algorithm.
May I ask for some advice on this?
Thanks
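
To make the question concrete, here is a minimal pure-Python sketch of how I understand a delayed-scaling recipe to use the amax history. It does not call any real library: the function names (`compute_scale`, `simulate`), the `"max"` and `"most_recent"` algorithm labels, and the scale formula `fp8_max / amax / 2**margin` are my assumptions about the mechanism, not a confirmed implementation.

```python
from collections import deque

FP8_E4M3_MAX = 448.0  # largest finite magnitude representable in FP8 E4M3


def compute_scale(history, algo="max", margin=0):
    """Derive a scaling factor from a window of past amax values (assumed formula)."""
    if algo == "max":
        amax = max(history)      # most conservative: largest amax in the window
    elif algo == "most_recent":
        amax = history[-1]       # most reactive: only the latest amax
    else:
        raise ValueError(f"unknown algo: {algo}")
    if amax <= 0.0:
        return 1.0               # avoid division by zero on all-zero tensors
    return FP8_E4M3_MAX / amax / (2 ** margin)


def simulate(amax_per_step, history_len, algo):
    """Return the scale used at each step; it is based only on earlier steps."""
    history = deque(maxlen=history_len)  # oldest entries fall out automatically
    scales = []
    for amax in amax_per_step:
        # Delayed scaling: the current step reuses statistics from previous steps.
        scales.append(compute_scale(history, algo) if history else 1.0)
        history.append(amax)
    return scales


# A single spike (8.0) in otherwise small activations.
observed = [1.0, 1.0, 8.0, 1.0, 1.0]
scales_max = simulate(observed, history_len=4, algo="max")
scales_recent = simulate(observed, history_len=4, algo="most_recent")
print(scales_max)
print(scales_recent)
```

In this toy run, the "max" variant keeps the smaller, spike-driven scale for as long as the spike stays in the window, while "most_recent" returns to the larger scale one step later. A longer window therefore looks safer against overflow but coarser for small values; what I cannot find is guidance on how that trade-off shows up in loss or benchmark results.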