EXL2 support prospects / options for non-NV/AMD GPUs and CPU, e.g. Vulkan, Intel GPU, OpenCL, SYCL, et al.? #327
Unanswered
ghchris2021 asked this question in Q&A
Replies: 0 comments
Hi,
Congratulations on the nice project!
I keep hearing good things about it and about EXL2 as a format, but I have yet to try it because I'd like to use EXL2 on a mix of GPUs (including Intel and NVIDIA) and to have CPU offloading available when VRAM alone is insufficient (sadly a significant use case).
So my ideal case at present would be for some project or option to support good inference with EXL2 using some combination of e.g. Vulkan, Intel GPU, OpenCL, SYCL, or CPU-based computation, either alone or mixed with what's already supported for NV (and AMD, I guess?) GPUs.
Do you know of any projects or plans by which EXL2 might be supported in any such way in the foreseeable future?
I've seen other projects that support other quantized formats, such as GGUF, GPTQ, and a few others, on CPU and on GPUs from a variety of vendors. So I wonder how different the adaptation would be, compared with the half dozen other quantized formats that are variously supported, to also handle EXL2 with similar, or at least modest, efficiency? Perhaps it would be a small adaptation (or not), just one that hasn't been entertained here or in other inference engine projects?