Bug: [SYCL] Inference not working correctly on multiple GPUs #8294
Labels: bug-unconfirmed, high severity, stale, SYCL
What happened?
I am using llama.cpp + SYCL to perform inference on a multi-GPU server. However, I get a segmentation fault when using multiple GPUs. The same model produces inference output correctly in single-GPU mode.
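For reference, a sketch of the two invocations being compared, based on the llama.cpp SYCL documentation's `-sm` (split mode) and `-mg` (main GPU) options; the model path and prompt are placeholders, not from the original report:

```shell
# Single-GPU mode (works): pin all layers to one device, no splitting.
ZES_ENABLE_SYSMAN=1 ./build/bin/llama-cli \
  -m models/model.gguf -p "Hello" -n 32 \
  -ngl 33 -sm none -mg 0

# Multi-GPU mode (segfaults per this report): split layers across devices.
ZES_ENABLE_SYSMAN=1 ./build/bin/llama-cli \
  -m models/model.gguf -p "Hello" -n 32 \
  -ngl 33 -sm layer
```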
Output of ./build/bin/llama-ls-sycl-device:

Name and Version
./llama-cli --version
version: 3292 (20fc3804)
built with Intel(R) oneAPI DPC++/C++ Compiler 2024.0.1 (2024.0.1.20231122) for x86_64-unknown-linux-gnu
What operating system are you seeing the problem on?
Linux
Relevant log output