FP8 AllGather Support in Fairscale #1185
base: ngoyal_changes_for_pp_fp8_jiecaoyu_debug
Commits on Mar 29, 2024
- 73ce4b4: added option for no PG validation for faster init (#1161)
  Co-authored-by: Naman Goyal <naman@fb.com>
- 33457b3
- 70f5ff5: This commit works with a 4 GPU run on SMALL model with FSDP and PP enabled.
- 16c682d
Commits on Apr 1, 2024
- 24a769f
Commits on Apr 2, 2024
- 3e2e77f: Fix `main_grad` attribute checking.
  - Clean up flatten and non_flatten parameter generation logic.
  - Avoid checking whether the `main_grad` attribute is all zeros.
Commits on Apr 9, 2024
- 1be7aa0:
  - Clean up the amax and scale update logic. Amax and scale updates are needed for both weights and parameters, so they should run in the forward pass of each microbatch.
  - Consolidate the `cast_params` and `all_gather` streams.
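For readers unfamiliar with the amax and scale update mentioned in the commit above, the following is a minimal sketch of one common scheme (delayed scaling from a short amax history). It is not the code from this PR: the `DelayedScale` class, `history_len`, and `margin` are hypothetical names chosen for illustration, and plain Python lists stand in for tensors. The only fixed fact it relies on is that 448 is the largest finite value in the FP8 E4M3 format.

```python
from collections import deque

E4M3_MAX = 448.0  # largest finite value representable in FP8 E4M3


class DelayedScale:
    """Hypothetical helper: tracks recent amax values and derives a scale."""

    def __init__(self, history_len=4, margin=0):
        self.amax_history = deque(maxlen=history_len)
        self.margin = margin
        self.scale = 1.0

    def update(self, values):
        # Record the absolute maximum seen in this microbatch's forward pass.
        amax = max(abs(v) for v in values)
        self.amax_history.append(amax)
        # Use the largest recent amax so one small microbatch
        # cannot shrink the range and cause overflow on the next one.
        recent = max(self.amax_history)
        if recent > 0.0:
            self.scale = E4M3_MAX / recent / (2 ** self.margin)
        return self.scale


# Assumed usage: one update per microbatch forward pass.
weights = DelayedScale()
for microbatch in ([0.5, -2.0], [1.0, -7.0], [0.25, 3.5]):
    scale = weights.update(microbatch)
```

Updating once per microbatch, as the commit message suggests, keeps the scale in step with the values that are actually cast before the all-gather; the history window here is an assumption and would be tuned in practice.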
Commits on Apr 10, 2024
- 57eb557
Commits on Apr 17, 2024
- e9e8f8e
- 21f8e05
Commits on May 20, 2024
- 8ec7c1d: added option for no PG validation for faster init (#1161)
  Co-authored-by: Naman Goyal <naman@fb.com>
- f27ab17
- fa9cf77: This commit works with a 4 GPU run on SMALL model with FSDP and PP enabled.
- 6fa19e0
- 80ffd54
- afb2ca1: Fix `main_grad` attribute checking.
  - Clean up flatten and non_flatten parameter generation logic.
  - Avoid checking whether the `main_grad` attribute is all zeros.
- 25b2322:
  - Clean up the amax and scale update logic. Amax and scale updates are needed for both weights and parameters, so they should run in the forward pass of each microbatch.
  - Consolidate the `cast_params` and `all_gather` streams.
- 0d1502b
- 5edb109
- 2df199f
- da36e31: Merge branch 'shikaili_fp8_allgather_no_pp_fix' of github.com:facebookresearch/fairscale into shikaili_fp8_allgather_no_pp_fix