
Could TransformerEngine work with Deepspeed Zero w/ offloading? #762

Open
leiwen83 opened this issue Apr 9, 2024 · 1 comment
Labels
question Further information is requested

Comments


leiwen83 commented Apr 9, 2024

Hi,

Since it is common to use DeepSpeed ZeRO with offloading when training large LLMs, does TE currently support this mode?

At the moment, DeepSpeed support appears to be covered only by a unit test, as referenced in TE's README: microsoft/DeepSpeed#3731

Thx~

@ptrendx ptrendx added question Further information is requested labels May 16, 2024
Contributor

sbhavani commented Sep 5, 2024

@leiwen83 I'd recommend using https://github.com/huggingface/accelerate/tree/main/benchmarks/fp8 which has an example with DS ZeRO 1-3 support. Please let us know if it's missing any features.
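For reference, the linked benchmarks drive FP8 training through Hugging Face Accelerate, which can combine FP8 (via Transformer Engine) with a DeepSpeed ZeRO config. A minimal sketch of an `accelerate` config file enabling ZeRO-3 with CPU offload alongside FP8 might look like the following (key names follow Accelerate's DeepSpeed config schema; the specific values here are illustrative assumptions, not taken from this thread):

```yaml
# Hypothetical accelerate config (e.g. ~/.cache/huggingface/accelerate/default_config.yaml)
compute_environment: LOCAL_MACHINE
distributed_type: DEEPSPEED
mixed_precision: fp8          # routes through Transformer Engine when available
num_processes: 8              # adjust to your GPU count
deepspeed_config:
  zero_stage: 3               # ZeRO stage 1-3 are supported
  offload_optimizer_device: cpu
  offload_param_device: cpu
  gradient_accumulation_steps: 1
```

You would then launch training with `accelerate launch train.py`; see the benchmark repo linked above for complete, tested examples.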
