Dear Triton Team, thank you for developing such an exceptional package that facilitates cloud inference. We are a group working on High Energy Physics experiments, looking to leverage your inference-as-a-service model to manage complex inference pipelines on remote GPUs. We are particularly interested in an efficient custom backend capable of supporting our extensive multi-module pipelines.
I would like to inquire if it is possible to implement a generic custom backend template similar to the TritonPythonModel available in the Python backend. This template would allow developers to focus solely on defining the initialization, execution, and finalization functions without having to manage the intricacies of the backend API, such as device selection.
Could we explore the feasibility of this proposal? I am ready and willing to volunteer my time to assist in the development of this feature.
To illustrate, I envision a base class structured as follows:
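A minimal sketch of what that base class might look like. All names here (`BaseCustomBackend`, `runPipeline`, the `int`-vector I/O) are placeholders for illustration, not part of any existing Triton API:

```cpp
#include <vector>

// Hypothetical base class a templated custom backend could expose.
// Device selection, memory management, and the Triton backend API
// calls would live in the template, not in user code.
class BaseCustomBackend {
 public:
  virtual ~BaseCustomBackend() = default;

  // Called once when the model instance is created
  // (e.g. load weights, allocate scratch buffers).
  virtual void initialize() {}

  // Pipeline-specific logic: consume prepared inputs, produce outputs.
  virtual std::vector<int> runPipeline(const std::vector<int>& inputs) = 0;

  // Called once when the model instance is destroyed.
  virtual void finalize() {}
};
```

A developer would then only subclass this and fill in the three hooks.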
And within the backend code:

```cpp
TRITONBACKEND_ModelInstanceExecute()
{
  BaseCustomBackend* customPipeline = new CustomPipeline();

  // Assume the inputs have already been gathered into pipeline-ready
  // buffers by Triton's BackendInputCollector.
  std::vector<int> inputs = {1, 2, 3};  // example inputs

  // Execute the pipeline.
  std::vector<int> outputs = customPipeline->runPipeline(inputs);

  // Prepare the responses via BackendOutputResponder ...
}
```
Is your feature request related to a problem? Please describe.
Yes, our aim is to reduce the complexity of developing custom backends for intricate pipelines, thereby boosting both efficiency and usability.
Describe the solution you'd like
A templated custom backend that abstracts lower-level details and allows developers to focus on pipeline-specific logic.
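As a sketch of the developer-facing side of such a template (all class and method names are hypothetical, and the trivial base class stands in for what the template would provide):

```cpp
#include <vector>

// Minimal stand-in for the proposed template base class (hypothetical).
class BaseCustomBackend {
 public:
  virtual ~BaseCustomBackend() = default;
  virtual void initialize() {}
  virtual std::vector<int> runPipeline(const std::vector<int>& inputs) = 0;
  virtual void finalize() {}
};

// Under this proposal the developer writes only pipeline-specific logic;
// the template handles device selection, I/O collection, and responses.
class ExamplePipeline : public BaseCustomBackend {
 public:
  std::vector<int> runPipeline(const std::vector<int>& inputs) override {
    std::vector<int> outputs;
    outputs.reserve(inputs.size());
    for (int x : inputs) {
      outputs.push_back(x * 2);  // placeholder for a real pipeline module
    }
    return outputs;
  }
};
```

The point of the design is that swapping in a different pipeline means writing a new subclass, with no changes to the backend-API plumbing.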
Describe alternatives you've considered
While other solutions may be considered, integrating a templated approach directly within Triton could substantially streamline development efforts.
Additional context
I am available to discuss this proposal further and provide additional use cases or details as needed.
tagging my colleagues @xju2 @ytchoutw @yongbinfeng @kpedro88