
Could TransformerEngine work with Deepspeed Zero w/ offloading? #762

Open
leiwen83 opened this issue Apr 9, 2024 · 1 comment
Labels
question Further information is requested

Comments


leiwen83 commented Apr 9, 2024

Hi,

Since it is common to use DeepSpeed ZeRO with offloading when training large LLMs, does TE currently support this mode?

At the moment, DeepSpeed support appears to be covered only by a unit test, as referenced in TE's README: microsoft/DeepSpeed#3731

Thx~

@ptrendx ptrendx added question Further information is requested labels May 16, 2024
Contributor

sbhavani commented Sep 5, 2024

@leiwen83 I'd recommend using https://github.com/huggingface/accelerate/tree/main/benchmarks/fp8 which has an example with DS ZeRO 1-3 support. Please let us know if it's missing any features.
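For reference, the linked benchmarks drive FP8 training through Hugging Face Accelerate, which can combine FP8 (via Transformer Engine) with a DeepSpeed ZeRO config. A minimal sketch of an `accelerate` config file enabling ZeRO-3 with CPU offload alongside FP8 might look like the following (key names follow Accelerate's DeepSpeed config schema; the specific values here are illustrative assumptions, not taken from this thread):

```yaml
# Hypothetical accelerate config (e.g. ~/.cache/huggingface/accelerate/default_config.yaml)
compute_environment: LOCAL_MACHINE
distributed_type: DEEPSPEED
mixed_precision: fp8          # routes through Transformer Engine when available
num_processes: 8              # adjust to your GPU count
deepspeed_config:
  zero_stage: 3               # ZeRO stage 1-3 are supported
  offload_optimizer_device: cpu
  offload_param_device: cpu
  gradient_accumulation_steps: 1
```

You would then launch training with `accelerate launch train.py`; see the benchmark repo linked above for complete, tested examples.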
