Vulkan Stable Diffusion Operators #904
@ggerganov I fixed two bugs while implementing this (fd01e5d and ecc1f51). Can I just cherry-pick those into a llama.cpp PR, or would that cause issues with the repo synchronization? Edit: Also 577b132
It's easier to merge in one repo and sync to the others. But if it's high priority, you can cherry-pick in llama.cpp and I'll resolve it later.
It doesn't seem to cause any significant issue on llama.cpp, so I'll wait for a sync unless someone opens an issue that would be fixed by this.
Btw, does this fix the following tests:
It should, yes. When refactoring the shader code into files I set a preprocessor value incorrectly, which caused matmuls to fail when k is not divisible by 8.
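For readers following along, here is a minimal sketch of why a matmul path that assumes k is a multiple of 8 gives wrong results otherwise. It is plain Python, not the Vulkan shader code from this PR (which is not shown in this thread); the names `dot_blocked`, `matmul`, `block`, and `handle_tail` are hypothetical and exist only for illustration.

```python
def dot_blocked(a, b, block=8, handle_tail=True):
    """Dot product over the shared dimension k, processed in fixed-size blocks.

    Hypothetical illustration: with handle_tail=False the code behaves like a
    kernel that assumes k is a multiple of `block` and skips the remainder.
    """
    k = len(a)
    full = (k // block) * block  # largest multiple of `block` that is <= k
    acc = 0.0
    # Main loop: whole blocks only.
    for start in range(0, full, block):
        for i in range(start, start + block):
            acc += a[i] * b[i]
    # Tail loop: the k % block leftover elements.
    if handle_tail:
        for i in range(full, k):
            acc += a[i] * b[i]
    return acc


def matmul(A, B, block=8, handle_tail=True):
    """Multiply A (M x K) by B (K x N), both given as lists of rows."""
    cols = list(zip(*B))  # columns of B
    return [[dot_blocked(row, col, block, handle_tail) for col in cols]
            for row in A]


if __name__ == "__main__":
    k = 10  # not divisible by 8
    A = [[1.0] * k]
    B = [[1.0] for _ in range(k)]
    print(matmul(A, B, handle_tail=True))   # [[10.0]]
    print(matmul(A, B, handle_tail=False))  # [[8.0]], last two terms dropped
```

With k = 16 both variants agree, which is consistent with a bug that only shows up for shapes where k is not a multiple of 8.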
I think I caught all of the major issues now. With these changes, stable-diffusion.cpp works with Vulkan on AMD and Nvidia.
It doesn't look ready yet, the latest commit crashes every time for me with settings that worked before:
Please always include which model you are running and the command you called it with.
My bad. Didn't have time to test thoroughly at the time. After some further testing I've determined the source of the problem to be quantization. Here is an example command:
Log:
@SkutteOleg Thank you for the report. I messed up one of the conditions for selecting a quantized matmul shader. That's fixed now, can you try again?
I forgot to check img2img, |
Works great, thank you! The issue where 1024x1024 would produce broken outputs is gone. I was also having an issue where Vulkan output looked too blotchy and noisy compared to CUDA12; that is fixed as well, to the point where CUDA12 images look noisier to me now. All my use cases are covered, great job!
Nice, should we proceed with merge?
I will add LEAKY_RELU (leejet/stable-diffusion.cpp#291 (comment)) in the next few hours, then we can merge.
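As a reference for what a leaky ReLU operator computes, here is a minimal sketch of the standard definition: positive inputs pass through unchanged and the rest are scaled by a small slope. The function name, the `negative_slope` parameter name, and its default value are assumptions made for illustration, not the signature of the operator added in this PR.

```python
def leaky_relu(xs, negative_slope=0.1):
    """Elementwise leaky ReLU: x if x > 0, otherwise negative_slope * x.

    Hypothetical reference implementation; the default slope is an
    illustrative assumption, not a value taken from this PR.
    """
    return [x if x > 0.0 else negative_slope * x for x in xs]


if __name__ == "__main__":
    print(leaky_relu([-2.0, 0.0, 3.0], negative_slope=0.5))  # [-1.0, 0.0, 3.0]
```

A CPU reference like this is the kind of thing a GPU operator's output can be compared against elementwise.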
Can confirm that txt2img and img2img are working fine on Vulkan.
I implemented the operators necessary for stable-diffusion.cpp to run using Vulkan. The corresponding PR is leejet/stable-diffusion.cpp#291
Image generation works now, but I want to add some minor stuff for LoRA/TAESD (leejet/stable-diffusion.cpp#291 (comment)), run further tests to make sure everything works, and maybe do some performance checks and optimizations before setting this to ready.