How to simply route requests with the same request id to the same model instance? #7861
Unanswered
fighterhit asked this question in Q&A
Replies: 0 comments
How can I route all inference requests that share a request id to the same model instance and still run inference with dynamic_batching? It seems this could be done with a stateful model by mapping the request id to a sequence id, but that feels overly complicated and requires additional control inputs, when all I need is routing to the same instance plus dynamic batching. Is there a simpler solution? Thanks!
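For reference, a minimal sketch of the request-id-to-sequence-id mapping mentioned above. It only covers the client-side step of deriving a stable integer from a request id string; the function name `request_id_to_sequence_id` is hypothetical, and avoiding 0 is an assumption based on 0 commonly meaning "not part of a sequence". It does not configure the server or settle whether control inputs are needed.

```python
import hashlib


def request_id_to_sequence_id(request_id: str) -> int:
    """Derive a stable, nonzero 63-bit integer from a request id string.

    Hypothetical helper: the same request id always yields the same integer,
    so it could be passed as a sequence (correlation) id.
    """
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    # Keep the top 8 bytes and clear the sign bit so the value fits a signed int64.
    value = int.from_bytes(digest[:8], "big") & 0x7FFFFFFFFFFFFFFF
    # Assumption: 0 is reserved for "no sequence", so never return it.
    return value or 1


if __name__ == "__main__":
    first = request_id_to_sequence_id("req-123")
    again = request_id_to_sequence_id("req-123")
    other = request_id_to_sequence_id("req-456")
    print(first == again)  # same request id maps to the same sequence id
    print(first != other)  # distinct ids are extremely unlikely to collide
```

The resulting integer would then be sent as the sequence id on every request that carries that request id. Whether that gives the desired batching behaviour depends on the model's sequence batching configuration.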