Question about padFilterWeights op. #3740
Comments
Where did you observe the call: in the build phase or the inference phase? From the name, it looks like it just pads the weights so that they fit the format required by a performant kernel, which should be needed.
I observed this call during the inference phase. I think it is because I pass the conv2d weights and biases as inputs to the model, so this call appears before each conv2d layer. Is there a way to do this operation in advance?
@nvpohanh Is this expected?
@theNefelibata Could you make the weights/bias constants? Or do they have to be network inputs?
They have to be inputs.
Then the padFilterWeights kernels are expected because we need to pad the weights for the Conv kernels to run. If the weights were constants, that could have been done offline.
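For readers unfamiliar with the idea, here is a rough pure-Python sketch of what "padding the weights" can mean conceptually: zero-filling the output-channel dimension of a KCRS filter tensor up to an alignment boundary. The function name `pad_output_channels`, the multiple of 8, and the choice of padding dimension are all assumptions made for illustration; the thread does not describe the actual layout that padFilterWeights produces.

```python
def pad_output_channels(weights, multiple=8):
    """Zero-pad the output-channel (K) dimension of KCRS filter weights.

    `weights` is a nested list shaped [K][C][R][S]. The alignment value is
    an illustrative assumption, not TensorRT's real kernel requirement.
    """
    k = len(weights)
    c = len(weights[0])
    r = len(weights[0][0])
    s = len(weights[0][0][0])
    # Round K up to the next multiple of `multiple`.
    padded_k = -(-k // multiple) * multiple
    # Copy the existing filters so the caller's data is not modified.
    padded = [[[list(row) for row in plane] for plane in filt] for filt in weights]
    # Append all-zero filters until the aligned size is reached.
    for _ in range(padded_k - k):
        padded.append([[[0.0] * s for _ in range(r)] for _ in range(c)])
    return padded


# Three 1x2x2 filters are padded up to eight filters.
weights = [[[[1.0, 2.0], [3.0, 4.0]]] for _ in range(3)]
padded = pad_output_channels(weights)
print(len(padded))  # 8
```

When the weights are constants, a transformation like this only has to run once ahead of time; when they are network inputs, it has to run on every inference, which matches the behaviour described above.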
Can I do this operation manually?
If the weights have to be network inputs, is that because you need to change them for every inference? Or do you only need to change them once and then run multiple inferences with the same set of weights? If it is the latter, I would recommend using the Refit feature instead of marking the weights as network inputs: https://docs.nvidia.com/deeplearning/tensorrt/developer-guide/index.html#refitting-engine-c This would allow you to refit the weights once at runtime and then run multiple inferences with the refitted weights, without running padFilterWeights for every inference.
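A minimal sketch of the refit flow using the TensorRT Python API might look like the following. It assumes the engine was built with the `REFIT` builder flag and that the weights are passed in a form TensorRT accepts (for example NumPy arrays); the helper name `refit_conv_layer` and the layer name in the usage note are hypothetical, and the linked guide describes the equivalent C++ calls.

```python
def refit_conv_layer(engine, logger, layer_name, kernel, bias=None):
    """Replace the kernel (and optionally bias) of one conv layer in place.

    Assumes `engine` is a refittable TensorRT engine and `layer_name`
    matches the layer name used when the network was built.
    """
    import tensorrt as trt  # imported here so the sketch loads without TensorRT

    refitter = trt.Refitter(engine, logger)
    if not refitter.set_weights(layer_name, trt.WeightsRole.KERNEL, kernel):
        raise RuntimeError(f"could not set kernel weights for {layer_name}")
    if bias is not None:
        if not refitter.set_weights(layer_name, trt.WeightsRole.BIAS, bias):
            raise RuntimeError(f"could not set bias weights for {layer_name}")
    # Apply the new weights; returns False if required weights are missing.
    if not refitter.refit_cuda_engine():
        raise RuntimeError("refit failed")
```

Usage would be roughly `refit_conv_layer(engine, logger, "conv1", new_kernel, new_bias)` once per weight update, followed by any number of inference calls on the same engine.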
I have tried Refit, but it can affect the inference speed.
I'm interested in the use case where the weights have to be changed for each inference. @theNefelibata could you please share your use case? Thanks!
I am trying alternatives to Refit.
Got it, thanks! Can we close this issue?
There are many nvinfer1::rt::cuda::padFilterWeights calls in my model, and I found that they occur before each conv2d op. I want to know what this function does and whether there is any way to avoid it. Thank you.