quantization process takes too long #35
Comments
Hi, I encountered an OOM problem when running step 3 on a single L40S GPU. How did you solve it?
Maybe it's because my Flux model is smaller (only 8B parameters) and my L40S GPU has 45 GB of memory, which is enough.
Thanks for the reply. I also use PixArt-Sigma (800M parameters) and encountered OOM on a 45 GB L40S GPU too. I use the original config YAML; should I modify it?
There is a batch-size setting in the config file; maybe the batch size is too large for your device.
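For anyone hitting the same OOM, here is a minimal sketch of shrinking the calibration batch size before launching PTQ. The config path and the `calib_data.batch_size` key are assumptions for illustration, not this repo's actual schema, so adapt them to whatever batch-size field your config actually exposes.

```python
# Hypothetical example: halve the calibration batch size in the PTQ config
# to reduce peak GPU memory. The path and key names below are assumptions.
import yaml

cfg_path = "configs/quant/ptq_example.yaml"  # placeholder path
with open(cfg_path) as f:
    cfg = yaml.safe_load(f)

old = cfg["calib_data"]["batch_size"]        # assumed key; check your own config
cfg["calib_data"]["batch_size"] = max(1, old // 2)

with open(cfg_path, "w") as f:
    yaml.safe_dump(cfg, f)
print(f"calib batch size: {old} -> {cfg['calib_data']['batch_size']}")
```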
Thanks for your advice. I run SDXL-Turbo on my GPUs, but it takes many hours. Waiting for parallel acceleration too :)
Hello @yyfcc17, I didn't see where we can set the GPU number in the config files. May I ask how you set this?
It's in
During the PTQ process, it only uses 1 GPU, although I have set 4 GPUs in the config file (I have 4 GPUs installed).
It takes almost 52 hours to quantize my 8B Flux model (Step 3 in your README) using an L40S GPU; is this normal?
Can we accelerate the PTQ process by quantizing blocks in parallel on different GPUs? (A rough sketch of that idea follows below.)
Thanks.
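A minimal sketch of the per-block parallelization idea asked about above: assign chunks of transformer blocks to different GPUs and run PTQ on each chunk in its own process. The functions `quantize_block` and `parallel_ptq` are hypothetical placeholders, not this repo's API, and the sketch assumes calibration inputs for each block have already been cached so blocks can be processed independently.

```python
import torch.multiprocessing as mp

def quantize_block(block, device):
    """Placeholder for per-block PTQ: move the block to `device`,
    run calibration / weight quantization there, then return it on CPU."""
    block = block.to(device)
    # ... collect activation statistics, fit quantization parameters, etc.
    return block.cpu()

def _worker(rank, indexed_blocks, out_queue):
    device = f"cuda:{rank}"
    out_queue.put([(i, quantize_block(b, device)) for i, b in indexed_blocks])

def parallel_ptq(blocks, num_gpus):
    # Round-robin the blocks across GPUs so each process handles ~len(blocks)/num_gpus.
    chunks = [[] for _ in range(num_gpus)]
    for i, b in enumerate(blocks):
        chunks[i % num_gpus].append((i, b))

    ctx = mp.get_context("spawn")  # spawn is required for CUDA in subprocesses
    out_queue = ctx.Queue()
    procs = [ctx.Process(target=_worker, args=(r, chunks[r], out_queue))
             for r in range(num_gpus)]
    for p in procs:
        p.start()

    quantized = {}
    for _ in procs:  # drain the queue before joining to avoid deadlocks
        for i, b in out_queue.get():
            quantized[i] = b
    for p in procs:
        p.join()
    return [quantized[i] for i in sorted(quantized)]
```

One caveat: block-wise PTQ usually calibrates block i on the outputs of the already-quantized blocks before it, so fully independent per-block processing can change the result; a common compromise is to cache full-precision activations once and calibrate every block against those. Also, with the spawn context these functions must live in an importable module and the call to `parallel_ptq` must sit under an `if __name__ == "__main__":` guard.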