ONNX GPT Inference on two devices #1234

11721206 · 2024-06-25T12:26:55Z

I exported a GPT model to ONNX and ran inference with ONNX Runtime. I get the message "CUDA kernel not found in registries for Op type: ScatterND", along with Memcpy from host and Memcpy to host operations. I opened an issue in the onnxruntime repo (microsoft/onnxruntime#21148), but it is still unresolved. In the Python InferenceSession, printing sess.get_providers() shows ["CUDAExecutionProvider", "CPUExecutionProvider"], which I take to mean that some ops run on the CPU. Are there any suggestions for my problem? Thanks.
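One way to check which ops actually fall back to the CPU is to turn on ONNX Runtime profiling and group the recorded node events by execution provider. Note that `sess.get_providers()` only lists the providers registered for the session, so it cannot show per-op placement by itself. The sketch below is a rough illustration, not a confirmed fix: `ops_by_provider` and `profile_model` are hypothetical helper names, `model_path` and `feeds` are placeholders for your own model and inputs, and the `cat`, `args.op_name`, and `args.provider` fields are an assumption about the profile JSON layout that may vary between ONNX Runtime versions.

```python
import json
from collections import Counter, defaultdict


def ops_by_provider(events):
    """Count profiled node events per (provider, op type).

    Assumes each node event looks roughly like
    {"cat": "Node", "args": {"op_name": ..., "provider": ...}}.
    Counts are event counts, not unique graph nodes.
    """
    placement = defaultdict(Counter)
    for event in events:
        if event.get("cat") != "Node":
            continue  # skip session-level events
        args = event.get("args") or {}
        provider = args.get("provider")
        op_name = args.get("op_name")
        if provider and op_name:
            placement[provider][op_name] += 1
    return {provider: dict(ops) for provider, ops in placement.items()}


def profile_model(model_path, feeds):
    """Run one profiled inference and summarize op placement (hypothetical helper)."""
    import onnxruntime as ort  # imported lazily so the summary helper works without it

    opts = ort.SessionOptions()
    opts.enable_profiling = True  # write a JSON profile for this session
    sess = ort.InferenceSession(
        model_path,
        sess_options=opts,
        providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
    )
    sess.run(None, feeds)
    profile_path = sess.end_profiling()  # returns the profile file name
    with open(profile_path) as f:
        return ops_by_provider(json.load(f))
```

If `ScatterND` or the Memcpy ops show up under `CPUExecutionProvider` in the result, that would be consistent with the message above. Setting `opts.log_severity_level = 0` is another option; the verbose log should then report node placement during session creation.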
