I just want to run ViT with FP16 precision, but I get the warning "[W] Running layernorm after self-attention in FP16 may cause overflow."
I would like all other layers to run in FP16 while the layernorm layers run in FP32.
Can you provide a C++ example of manually setting the layernorm layers to run in FP32 while still using "config->setFlag(BuilderFlag::kFP16);"?
ONNX opset == 17
TensorRT == 8.6
Windows 11
[W] [TRT] Detected layernorm nodes in FP16: /neck/neck.1/ReduceMean_1, /neck/neck.3/ReduceMean_1, /neck/neck.1/Sqrt, /neck/neck.3/Sqrt, /neck/neck.1/Pow, /neck/neck.3/Add, /neck/neck.1/Add_1, /neck/neck.3/Sub, /neck/neck.3/Div, /neck/neck.1/Mul, /neck/neck.1/Div, /neck/neck.3/Add_1, /neck/neck.3/Pow, /neck/neck.1/Add, /neck/neck.3/Mul, /neck/neck.1/Sub
[03/03/2024-21:53:14] [W] [TRT] Running layernorm after self-attention in FP16 may cause overflow. Exporting the model to the latest available ONNX opset (later than opset 17) to use the INormalizationLayer, or forcing layernorm layers to run in FP32 precision can help with preserving accuracy.
So, how can I force the layernorm layers to run at FP32 precision while keeping "config->setFlag(BuilderFlag::kFP16);" enabled?
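Here is a minimal sketch of one way to do this with the TensorRT C++ API: enable FP16 globally, enable kOBEY_PRECISION_CONSTRAINTS, then walk the parsed network and pin the decomposed layernorm nodes (the ReduceMean/Pow/Sqrt/Sub/Div/Add/Mul ops listed in the warning) to FP32 via ILayer::setPrecision and ILayer::setOutputType. The variable names and the name-substring matching are assumptions on my part (TensorRT may fuse or rename layers), so adjust the patterns to the node names printed in your own build log.

```cpp
#include <string>
#include "NvInfer.h"

// Sketch: force layernorm-related layers to FP32 while the rest of the
// network builds in FP16. Assumes `network` and `config` were already
// created and the ONNX model was parsed into `network`.
void forceLayerNormFp32(nvinfer1::INetworkDefinition* network,
                        nvinfer1::IBuilderConfig* config)
{
    // Enable FP16 overall, but tell the builder to honor per-layer precision.
    config->setFlag(nvinfer1::BuilderFlag::kFP16);
    config->setFlag(nvinfer1::BuilderFlag::kOBEY_PRECISION_CONSTRAINTS);

    for (int i = 0; i < network->getNbLayers(); ++i)
    {
        nvinfer1::ILayer* layer = network->getLayer(i);
        std::string name = layer->getName();

        // Heuristic (assumption): the decomposed layernorm appears as
        // ReduceMean/Pow/Sqrt/Sub/Div/Add/Mul nodes under the /neck/ scopes
        // reported in the warning. Tighten these patterns for your model.
        bool isLayerNormNode =
            name.find("ReduceMean") != std::string::npos ||
            name.find("Pow")        != std::string::npos ||
            name.find("Sqrt")       != std::string::npos ||
            name.find("Sub")        != std::string::npos ||
            name.find("Div")        != std::string::npos ||
            (name.find("/neck/") != std::string::npos &&
             (name.find("Add") != std::string::npos ||
              name.find("Mul") != std::string::npos));

        if (isLayerNormNode)
        {
            // Run this layer in FP32 and keep its outputs in FP32 so the
            // builder inserts the necessary casts to/from the FP16 layers.
            layer->setPrecision(nvinfer1::DataType::kFLOAT);
            for (int j = 0; j < layer->getNbOutputs(); ++j)
            {
                layer->setOutputType(j, nvinfer1::DataType::kFLOAT);
            }
        }
    }
}
```

If kOBEY_PRECISION_CONSTRAINTS rejects the build or hurts performance, kPREFER_PRECISION_CONSTRAINTS is the softer alternative: the builder will try to respect the per-layer settings but may fall back where no FP32 implementation is available.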