[Bug]: llama 405B fp8 fails #140
Comments
Hi, which branch are you using? There is no FP8 support for Gaudi on habana_main currently.
Is there a reason for that? I know the Habana documentation claims it supports FP8. I just got Llama 3.1 working on TGI, with a PR that is being considered for Optimum Habana, and I was going to subsequently try to load FP8 models onto the Gaudi 2 system for use at a Stanford Law School hackathon that our group, LAION AI / OPEA, is sponsoring on the 8th.
I could not find any claims of FP8 support for vLLM in the Habana docs, and both of our releases and their READMEs explicitly state the lack of FP8 support in the Unsupported Features section (1.16.0, 1.17.0). That said, Gaudi2 hardware definitely does support FP8, but it was not enabled on the habana_main branch. We brought FP8 support to vLLM for HPU very recently in #144.
Closing the issue - FP8 support was added and the reported issue is not reproducible anymore. Please open a new issue if you experience any issues with Llama 405B.
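For anyone landing here later, here is a minimal sketch of how one might load an FP8 checkpoint through vLLM's offline API once FP8 support is present in your build. The model name, the `quantization="fp8"` flag, and the parallelism setting are illustrative assumptions, not values taken from this issue; check the README of the release you are running for the exact configuration expected on Gaudi.

```python
# Hypothetical sketch: serving an FP8-quantized Llama model with vLLM's
# offline API. Model name, quantization flag, and tensor_parallel_size are
# assumptions for illustration; adjust them to your release and hardware.
from vllm import LLM, SamplingParams

llm = LLM(
    model="meta-llama/Llama-3.1-405B-Instruct-FP8",  # assumed FP8 checkpoint
    quantization="fp8",        # assumed flag; see the release README for Gaudi specifics
    tensor_parallel_size=8,    # 405B needs several devices; match your topology
)

params = SamplingParams(temperature=0.7, max_tokens=64)
outputs = llm.generate(["Write one sentence about FP8 inference."], params)
print(outputs[0].outputs[0].text)
```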