ONNX GPT Inference on two devices #1234

Open
11721206 opened this issue Jun 25, 2024 · 0 comments

Comments

11721206 commented Jun 25, 2024

I exported a model to ONNX and ran inference on it. I get the message "CUDA kernel not found in registries for Op type: ScatterND", and I also see Memcpy nodes from host and to host. I opened an issue in the onnxruntime repo (microsoft/onnxruntime#21148), but it is still unresolved. In a Python InferenceSession, I printed the result of sess.get_providers(), which shows ["CUDAExecutionProvider", "CPUExecutionProvider"]; I take this to mean that some ops run on the CPU. Are there any suggestions for this problem? Thanks.
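
For anyone trying to reproduce this: sess.get_providers() only lists the providers registered on the session, so by itself it does not show which ops actually ran on the CPU. One way to check is onnxruntime's built-in profiler. The sketch below is an assumption-laden illustration, not a confirmed fix: `count_ops_by_provider` and `profile_one_run` are hypothetical helper names, and the `provider` and `op_name` keys inside each trace event's `args` are what the profile JSON is assumed to contain, so check them against the trace your onnxruntime version writes.

```python
import json
from collections import Counter


def count_ops_by_provider(events):
    """Count (provider, op_type) pairs among trace events that carry both fields.

    Assumes each node event has an "args" dict with "provider" and "op_name".
    Events without both keys (session-level events, for example) are skipped.
    """
    counts = Counter()
    for event in events:
        args = event.get("args") or {}
        provider = args.get("provider")
        op_name = args.get("op_name")
        if provider and op_name:
            counts[(provider, op_name)] += 1
    return counts


def profile_one_run(model_path, feeds):
    """Run the model once with profiling enabled and summarise the trace.

    model_path and feeds (input name -> numpy array) are supplied by the caller.
    """
    import onnxruntime as ort  # imported lazily so the helper above works without it

    options = ort.SessionOptions()
    options.enable_profiling = True
    options.log_severity_level = 0  # verbose logging
    session = ort.InferenceSession(
        model_path,
        sess_options=options,
        providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
    )
    session.run(None, feeds)
    profile_path = session.end_profiling()  # path of the JSON trace file
    with open(profile_path) as f:
        return count_ops_by_provider(json.load(f))
```

The counts are per trace event rather than per graph node, so they are best read as "which op types showed up under which provider". If ScatterND appears under CPUExecutionProvider, that would match the "CUDA kernel not found" message and explain the Memcpy nodes around it.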
