feat: Add parameters support to InferResponse #394
base: main
Conversation
Force-pushed from 7b63cc9 to c3a9c1b
@@ -58,6 +60,12 @@ InferResponse::OutputTensors()
  return output_tensors_;
}

std::string&
Should it be mutable or const?
const bool is_last_response = true, void* id = nullptr);
std::vector<std::shared_ptr<PbTensor>>& OutputTensors();
std::string& Parameters();
Looks like this parameters string is expected to be JSON serializable (unless empty string?) - can you add some docstrings describing that expectation?
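To make the reviewer's point concrete: the expectation would be that the string returned by Parameters() is either empty (no parameters set) or a serialized JSON object whose values are bool, int, or string. A hypothetical illustration in Python (the checker function is mine, not part of the PR):

```python
import json

# The accessor is expected to hold either "" (no parameters set) or a
# serialized JSON object, for example:
no_params = ""
with_params = json.dumps({"triton_final_response": True, "top_k": 3})


def is_valid_parameters_string(s):
    # Hypothetical checker for the documented expectation: empty string,
    # or a JSON object whose values are all bool, int, or str.
    if s == "":
        return True
    try:
        obj = json.loads(s)
    except json.JSONDecodeError:
        return False
    return isinstance(obj, dict) and all(
        isinstance(v, (bool, int, str)) for v in obj.values()
    )
```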
Can we add a README section for response parameters similar to the one for request parameters? https://github.com/triton-inference-server/python_backend?tab=readme-ov-file#inference-request-parameters
Yes, it is added. 816d4ac
src/pb_stub.cc (outdated)
if (!py::isinstance<py::bool_>(pair.second) &&
    !py::isinstance<py::int_>(pair.second) &&
    !py::isinstance<py::str>(pair.second)) {
Should we have any protection against modifying reserved parameter keys, such as triton_final_response, from user code?
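One way to picture the protection being asked about: a guard that mirrors the stub-side type check in the diff (only bool, int, and str values) and additionally rejects reserved keys. This is a hypothetical sketch of the reviewer's suggestion, not code from the PR; treating triton_final_response as reserved is an assumption here.

```python
import json

# Assumption: triton_final_response is a key Triton manages itself for
# decoupled models, so user code should not be allowed to set it.
RESERVED_PARAMETER_KEYS = {"triton_final_response"}


def validate_response_parameters(params):
    # Hypothetical guard mirroring the stub-side type check shown in the
    # diff (bool, int, str only), extended with reserved-key protection.
    for key, value in params.items():
        if key in RESERVED_PARAMETER_KEYS:
            raise ValueError("parameter key '%s' is reserved" % key)
        if not isinstance(value, (bool, int, str)):
            raise TypeError("parameter '%s' must be bool, int, or str" % key)
    return json.dumps(params)
```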
* Infer response to track parameters
* Add parameters to binding infer response
* Rank parameters argument up among InferResponse constructor arguments
* Add setting parameters to Triton response
* Send response parameters only on non-error
* Fix double declaration
* Unify py dictionary parameters to json str
Force-pushed from c3a9c1b to 816d4ac
What does the PR do?
Adds support for setting response parameters on responses from both regular and decoupled Python backend models.
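The core of the change, per the "Unify py dictionary parameters to json str" commit, is that the parameters dict supplied by the model travels with the response as a single JSON string. A minimal round-trip sketch of that idea, with hypothetical function names (the real conversion lives in the C++ stub, not in Python helpers like these):

```python
import json


def pack_parameters(params):
    # Stub side (hypothetical mirror): the dict supplied by the model is
    # serialized into one JSON string carried on the response.
    return json.dumps(params)


def unpack_parameters(serialized):
    # Receiver side: an empty string means no parameters were set.
    return {} if serialized == "" else json.loads(serialized)
```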
Checklist
<commit_type>: <Title>
Commit Type: check the conventional commit type box and add the matching label to the GitHub PR.
Related PRs:
triton-inference-server/server#7964
Where should the reviewer start?
Start with the test cases on the related server PR, then pb_stub.cc -> infer_response.cc -> python_be.cc.
Test plan:
New tests are added to the related server PR.
Caveats:
Responses returned to BLS callers are not populated with the response parameters, if any (DLIS-7864).
Background
N/A
Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)
N/A