I just want to run ViT with FP16 precision, but I get the warning "[W] Running layernorm after self-attention in FP16 may cause overflow."
I would like all other layers to run in FP16 while the layernorm layers run in FP32.
Can you provide a C++ example of manually setting the layernorm layers to run in FP32 while still using "config->setFlag(BuilderFlag::kFP16);"?
ONNX opset == 17
TensorRT == 8.6
Windows 11
[W] [TRT] Detected layernorm nodes in FP16: /neck/neck.1/ReduceMean_1, /neck/neck.3/ReduceMean_1, /neck/neck.1/Sqrt, /neck/neck.3/Sqrt, /neck/neck.1/Pow, /neck/neck.3/Add, /neck/neck.1/Add_1, /neck/neck.3/Sub, /neck/neck.3/Div, /neck/neck.1/Mul, /neck/neck.1/Div, /neck/neck.3/Add_1, /neck/neck.3/Pow, /neck/neck.1/Add, /neck/neck.3/Mul, /neck/neck.1/Sub
[03/03/2024-21:53:14] [W] [TRT] Running layernorm after self-attention in FP16 may cause overflow. Exporting the model to the latest available ONNX opset (later than opset 17) to use the INormalizationLayer, or forcing layernorm layers to run in FP32 precision can help with preserving accuracy.
So, how can I force the layernorm layers to run at FP32 precision while keeping "config->setFlag(BuilderFlag::kFP16);" enabled?
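Here is a minimal sketch of one way to do this with the TensorRT C++ API: enable FP16 globally, enable kOBEY_PRECISION_CONSTRAINTS, then walk the parsed network and pin the decomposed layernorm nodes (the ReduceMean/Pow/Sqrt/Sub/Div/Add/Mul ops listed in the warning) to FP32 via ILayer::setPrecision and ILayer::setOutputType. The variable names and the name-substring matching are assumptions on my part (TensorRT may fuse or rename layers), so adjust the patterns to the node names printed in your own build log.

```cpp
#include <string>
#include "NvInfer.h"

// Sketch: force layernorm-related layers to FP32 while the rest of the
// network builds in FP16. Assumes `network` and `config` were already
// created and the ONNX model was parsed into `network`.
void forceLayerNormFp32(nvinfer1::INetworkDefinition* network,
                        nvinfer1::IBuilderConfig* config)
{
    // Enable FP16 overall, but tell the builder to honor per-layer precision.
    config->setFlag(nvinfer1::BuilderFlag::kFP16);
    config->setFlag(nvinfer1::BuilderFlag::kOBEY_PRECISION_CONSTRAINTS);

    for (int i = 0; i < network->getNbLayers(); ++i)
    {
        nvinfer1::ILayer* layer = network->getLayer(i);
        std::string name = layer->getName();

        // Heuristic (assumption): the decomposed layernorm appears as
        // ReduceMean/Pow/Sqrt/Sub/Div/Add/Mul nodes under the /neck/ scopes
        // reported in the warning. Tighten these patterns for your model.
        bool isLayerNormNode =
            name.find("ReduceMean") != std::string::npos ||
            name.find("Pow")        != std::string::npos ||
            name.find("Sqrt")       != std::string::npos ||
            name.find("Sub")        != std::string::npos ||
            name.find("Div")        != std::string::npos ||
            (name.find("/neck/") != std::string::npos &&
             (name.find("Add") != std::string::npos ||
              name.find("Mul") != std::string::npos));

        if (isLayerNormNode)
        {
            // Run this layer in FP32 and keep its outputs in FP32 so the
            // builder inserts the necessary casts to/from the FP16 layers.
            layer->setPrecision(nvinfer1::DataType::kFLOAT);
            for (int j = 0; j < layer->getNbOutputs(); ++j)
            {
                layer->setOutputType(j, nvinfer1::DataType::kFLOAT);
            }
        }
    }
}
```

If kOBEY_PRECISION_CONSTRAINTS rejects the build or hurts performance, kPREFER_PRECISION_CONSTRAINTS is the softer alternative: the builder will try to respect the per-layer settings but may fall back where no FP32 implementation is available.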