Triton inference server and inference on different machines. #5769
Unanswered
prasad-nair asked this question in Q&A
Replies: 2 comments
-
@prasad-nair You might need a container orchestration setup such as Kubernetes, where Triton instances run on different devices as worker nodes managed by the Kubernetes control plane. Other approaches would add latency to inference time and so might not be useful (though I'd like to hear more about them).
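In a Kubernetes setup, a Service or load balancer would normally spread requests across the Triton pods. As a minimal sketch of the same idea done client-side, the hypothetical helper below round-robins over several Triton endpoints (the hostnames and the `TritonEndpointRouter` name are illustrative, not part of Triton itself):

```python
from itertools import cycle


class TritonEndpointRouter:
    """Round-robin router over several Triton server instances.

    Hypothetical helper: in Kubernetes this job is done by a Service
    or load balancer, but the routing idea is the same.
    """

    def __init__(self, endpoints):
        # cycle() yields the endpoints in order, forever.
        self._endpoints = cycle(endpoints)

    def next_endpoint(self):
        return next(self._endpoints)


router = TritonEndpointRouter(["node-a:8000", "node-b:8000"])
print(router.next_endpoint())  # node-a:8000
print(router.next_endpoint())  # node-b:8000
# Each endpoint could then be passed to e.g.
# tritonclient.http.InferenceServerClient(url=...) to send the request.
```

This keeps routing logic in the client; the trade-off is that the client must know every endpoint, whereas an orchestrator hides them behind one address.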
-
I recall that you can add more than one model repository to the same Triton server. With this, if you host the models on another device and can mount their storage as a repository in Triton, it might be possible.
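As a deployment sketch of this idea: Triton's `tritonserver` binary accepts the `--model-repository` flag more than once, so a remotely hosted repository can be mounted and added alongside a local one. The hostnames and paths below are placeholders, and NFS is just one way to do the mount:

```shell
# Mount a model repository exported by another machine
# (remote-host and all paths here are placeholders).
sudo mount -t nfs remote-host:/srv/models /mnt/remote_models

# Point a single Triton instance at both repositories;
# --model-repository may be repeated.
tritonserver \
  --model-repository=/opt/local_models \
  --model-repository=/mnt/remote_models
```

Note that this only moves model *storage* to another device: the inference compute still runs on the machine hosting the Triton instance, which may or may not be what the question is after.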
-
Is it possible to run Triton on one device and run inference on models hosted on other devices?