Description
In BLS mode, a TensorRT backend model is called hundreds of times, and the processing time increases as the number of CPU cores decreases.
Triton Information
nvcr.io/nvidia/tritonserver:24.05-py3
To Reproduce
![341697488-bff9ee2e-8dee-4261-a41b-3be820873f7f](https://private-user-images.githubusercontent.com/146302419/342746543-a10c350e-a4ab-49d5-a93e-83d4092d9405.jpg?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MjAxMTU5MDgsIm5iZiI6MTcyMDExNTYwOCwicGF0aCI6Ii8xNDYzMDI0MTkvMzQyNzQ2NTQzLWExMGMzNTBlLWE0YWItNDlkNS1hOTNlLTgzZDQwOTJkOTQwNS5qcGc_WC1BbXotQWxnb3JpdGhtPUFXUzQtSE1BQy1TSEEyNTYmWC1BbXotQ3JlZGVudGlhbD1BS0lBVkNPRFlMU0E1M1BRSzRaQSUyRjIwMjQwNzA0JTJGdXMtZWFzdC0xJTJGczMlMkZhd3M0X3JlcXVlc3QmWC1BbXotRGF0ZT0yMDI0MDcwNFQxNzUzMjhaJlgtQW16LUV4cGlyZXM9MzAwJlgtQW16LVNpZ25hdHVyZT0zZWI3Y2EwMGVmYTY5YjgwODhjOWY0NThmOGE2N2FkMjBiMmRhMDkxMTg2MDVlYjI5OTI2YjZjY2JjMzE0MjkyJlgtQW16LVNpZ25lZEhlYWRlcnM9aG9zdCZhY3Rvcl9pZD0wJmtleV9pZD0wJnJlcG9faWQ9MCJ9.At9QjC-ELUa0TYiA-4NqW04Gl6F6fvxX0k8cRzARUpk)
My BLS code looks like this: model.py in the BLS model calls t2s_sdec, whose platform is "tensorrt_plan".
model transformation
With each iteration of the for loop, the input gradually becomes larger.
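To see where the time goes, it may help to time each BLS call separately. The following is only a minimal sketch, not the actual model.py: `timed_bls_calls`, `make_inputs`, and the output name `OUTPUT0` are hypothetical names chosen for illustration, and `pb_utils` is expected to be the `triton_python_backend_utils` module that is available inside a Python backend model.

```python
import time


def timed_bls_calls(pb_utils, make_inputs, steps, model_name="t2s_sdec",
                    output_names=("OUTPUT0",)):
    """Call `model_name` `steps` times via BLS and return per-call durations.

    pb_utils     -- the triton_python_backend_utils module (passed in so the
                    helper can also be exercised outside Triton)
    make_inputs  -- hypothetical callable: step index -> list of pb_utils.Tensor
    output_names -- placeholder output names; replace with the model's own
    """
    durations = []
    for step in range(steps):
        request = pb_utils.InferenceRequest(
            model_name=model_name,
            requested_output_names=list(output_names),
            inputs=make_inputs(step),
        )
        start = time.perf_counter()
        response = request.exec()  # synchronous BLS call
        durations.append(time.perf_counter() - start)
        if response.has_error():
            raise RuntimeError(response.error().message())
    return durations
```

Inside `execute`, something like `timed_bls_calls(pb_utils, make_inputs, 387)` would return one duration per call, which shows whether every call gets slower with fewer cores or only some of them do.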
Set in t2s_sdec/config.json: parameters: {key: "FORCE_CPU_ONLY_INPUT_TENSORS" value: {string_value:"no"}}
With 100 CPU cores, the 387 calls take a total time of 2 s, and the other time is 300 ms.
With 24 CPU cores, the 387 calls take a total time of 5 s, and the other time is 600 ms.
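For reference, the totals above can be converted into an average cost per call. This is plain arithmetic on the reported numbers (387 calls, 2 s and 5 s), assuming the total time is spread evenly over the calls:

```python
CALLS = 387
# CPU cores -> reported total time in seconds
total_seconds = {100: 2.0, 24: 5.0}

# Average time per BLS call in milliseconds
per_call_ms = {cores: total / CALLS * 1000 for cores, total in total_seconds.items()}
for cores, ms in per_call_ms.items():
    print(f"{cores} cores: {ms:.2f} ms per call")
# 100 cores: 5.17 ms per call
# 24 cores: 12.92 ms per call

slowdown = total_seconds[24] / total_seconds[100]
print(f"slowdown: {slowdown:.1f}x")  # slowdown: 2.5x
```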
The number of CPU cores is set when the Docker container is started, with --cpuset-cpus=0-23.
There is no interference from other processes.
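One thing that could be checked inside the container is how many CPUs the process is actually allowed to run on. A small sketch, assuming a Linux container: `os.sched_getaffinity` reflects the cpuset restriction, whereas `os.cpu_count()` may still report the host's core count. The helper name `usable_cpu_count` is made up for this example.

```python
import os


def usable_cpu_count():
    """Number of CPUs this process may run on (respects --cpuset-cpus on Linux)."""
    if hasattr(os, "sched_getaffinity"):
        return len(os.sched_getaffinity(0))
    # Fallback for platforms without sched_getaffinity
    return os.cpu_count() or 1


print("usable CPUs:", usable_cpu_count())
print("os.cpu_count():", os.cpu_count())
```

With --cpuset-cpus=0-23 the first value would be expected to be 24.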
Expected behavior
I hope that reducing the number of CPU cores will not affect the overall processing time.