
quantization process takes too long #35

Open
yyfcc17 opened this issue Dec 10, 2024 · 7 comments
Comments


yyfcc17 commented Dec 10, 2024

During the PTQ process, it only uses 1 GPU, although I have set 4 GPUs in the config file (I have 4 GPUs installed).

It takes almost 52 hours to quantize my 8B Flux model (Step 3 in your README) on a single L40S GPU. Is this normal?

Can we accelerate the PTQ process by quantizing blocks in parallel on different GPUs?

Thanks.
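On the parallelism question, a minimal sketch of what per-block data parallelism could look like, assuming blocks can be calibrated independently. The `quantize_block` function, block count, and device assignment below are hypothetical, not the repo's API:

```python
# Hypothetical sketch of per-block PTQ parallelism; `quantize_block`, the
# block count, and the device list are illustrative, not the repo's API.
from concurrent.futures import ThreadPoolExecutor

def assign_devices(num_blocks, num_gpus):
    """Round-robin assignment of transformer blocks to GPUs."""
    return [(i, f"cuda:{i % num_gpus}") for i in range(num_blocks)]

def quantize_block(job):
    block_idx, device = job
    # A real implementation would move block `block_idx` and its calibration
    # activations to `device` and run the PTQ reconstruction loop there,
    # typically in one worker process per GPU rather than a thread.
    return (block_idx, device, "quantized")

jobs = assign_devices(num_blocks=8, num_gpus=4)
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(quantize_block, jobs))
print(results[:2])  # [(0, 'cuda:0', 'quantized'), (1, 'cuda:1', 'quantized')]
```

One caveat: many block-wise PTQ methods calibrate block i+1 on the output of the already-quantized block i, so fully independent parallelism may change accuracy.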

@senlyu163

Hi, I encountered an OOM problem when running Step 3 on a single L40S GPU. How did you solve it?

yyfcc17 commented Dec 11, 2024

Maybe it's because my Flux model is smaller (only 8B), and my L40S GPU's 45 GB of memory is enough.
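As a rough sanity check (assuming fp16 weights; calibration activations and any optimizer state add on top of this):

```python
# Back-of-envelope weight memory for an 8B-parameter model in fp16.
params = 8e9
bytes_per_param = 2                       # fp16
weight_gb = params * bytes_per_param / 1e9
print(f"{weight_gb:.0f} GB")              # 16 GB of weights
```

so the weights alone leave plenty of headroom on a 45 GB card for activations and calibration data.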

@senlyu163

> Maybe it's because my Flux model is smaller (only 8B), and my L40S GPU's 45 GB of memory is enough.

Thanks for the reply. I hit OOM on a 45 GB L40S too, and I use PixArt-Sigma, which is only 800M parameters. I use the original config YAML; should I modify it?


yyfcc17 commented Dec 11, 2024

There is a `batch_size: 256` entry in the config file; maybe the batch size is too large for your device.
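If so, lowering it in the config YAML should help; the value below is illustrative, only the `batch_size` key itself is confirmed to exist:

```yaml
# Illustrative: reduce the calibration batch size to fit a 45 GB GPU;
# lower it further if OOM persists.
batch_size: 32
```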

@senlyu163

Thanks for your advice. I ran SDXL-Turbo on my GPUs, but it took many hours. Waiting for parallel acceleration too :)

@Chenglin-Yang

Hi

> During the PTQ process, it only uses 1 GPU, although I have set 4 GPUs in the config file (I have 4 GPUs installed).
>
> It takes almost 52 hours to quantize my 8B Flux model (Step 3 in your README) on a single L40S GPU. Is this normal?
>
> Can we accelerate the PTQ process by quantizing blocks in parallel on different GPUs?

Hello @yyfcc17, I didn't see where we can set the GPU number in the config files. May I ask how you set this?


yyfcc17 commented Dec 13, 2024

> Hello @yyfcc17, I didn't see where we can set the GPU number in the config files. May I ask how you set this?

It's in examples/diffusion/configs/__default__.yaml, but it seems to apply only to evaluation.
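Until multi-GPU PTQ is supported, a common workaround (generic CUDA behavior, not a feature of this repo) is to restrict which physical device the process sees before CUDA is initialized:

```python
# Generic workaround, not specific to this repo: restrict the PTQ process
# to one GPU by setting CUDA_VISIBLE_DEVICES before torch/CUDA initializes.
import os
os.environ["CUDA_VISIBLE_DEVICES"] = "0"   # expose only physical GPU 0
print(os.environ["CUDA_VISIBLE_DEVICES"])
```

Equivalently, set it on the command line when launching the PTQ script: `CUDA_VISIBLE_DEVICES=0 python <your ptq entry point>`.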
