Layer to support heterogeneous inference #19653
Unanswered
wilderfield asked this question in Other Q&A
Replies: 0 comments
Does ONNX Runtime support the idea of running inference of a model heterogeneously? For instance, I've heard CoreML does.
Something like:
Evaluate the computational graph of a model, the data types used, and the operations required, then determine the most efficient way to execute the model across the CPU, GPU, and third-party accelerators, taking power efficiency, computational speed, and memory usage into account. Dynamically choosing the best processing unit for a given task, without explicit instructions from the developer, would be a significant advantage, particularly for applications where performance and power efficiency are critical.
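For reference, the closest existing mechanism is ONNX Runtime's execution-provider (EP) list: the developer names providers in priority order, the runtime partitions the graph among them, and any node a provider cannot handle falls back to the next one (ultimately the CPU EP). This is a minimal sketch of that mechanism, not the fully automatic, power-aware scheduling described above; the model path and input shape are hypothetical, and the CUDA EP assumes an onnxruntime-gpu build.

```python
import numpy as np
import onnxruntime as ort

# Providers are tried in priority order; nodes unsupported by one provider
# fall through to the next, with CPUExecutionProvider as the final fallback.
providers = [
    "CUDAExecutionProvider",  # GPU, if available in this build
    "CPUExecutionProvider",   # always available
]

# "model.onnx" is a placeholder path for illustration.
session = ort.InferenceSession("model.onnx", providers=providers)

# Inspect which providers the session actually registered.
print(session.get_providers())

# Run inference as usual; per-node placement across providers is handled internally.
input_name = session.get_inputs()[0].name
dummy_input = np.random.rand(1, 3, 224, 224).astype(np.float32)  # hypothetical shape
outputs = session.run(None, {input_name: dummy_input})
```

The partitioning here is static per session and driven by the developer-supplied priority list, rather than chosen dynamically by the runtime based on power or memory considerations.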