ONNX export of integer weights with large models #872
Comments
@costigt-dev @Giuseppe5 Brevitas seems to be using ...
Can also be reproduced with ...
Note: doing the export with

export_manager = StdQCDQONNXManager
export_manager.change_weight_export(export_weight_q_node=True)
with torch.no_grad(), brevitas_proxy_export_mode(quantized_model, export_manager=export_manager):

instead of simply

with torch.no_grad(), brevitas_proxy_export_mode(quantized_model, export_manager=StdQCDQONNXManager):

fixes the issue. However, this is not a good long-term fix, as the serialized model is then ~4x bigger.
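A minimal sketch of that workaround, assuming the surrounding optimum-amd flow provides `quantized_model`, `example_inputs`, and `brevitas_proxy_export_mode`; the import path and the final `torch.onnx.export` call are assumptions, not the exact code used in this thread:

```python
import torch
from brevitas.export import StdQCDQONNXManager  # assumed import path

# Ask the export manager to emit Q -> DQ nodes for weights instead of storing
# integer weight initializers directly. This avoids the >2GB export error,
# at the cost of a ~4x larger serialized model.
export_manager = StdQCDQONNXManager
export_manager.change_weight_export(export_weight_q_node=True)

with torch.no_grad(), brevitas_proxy_export_mode(quantized_model, export_manager=export_manager):
    # Hypothetical export call; in the optimum-amd flow the actual export is
    # driven by the surrounding utilities rather than a direct torch.onnx.export.
    torch.onnx.export(quantized_model, (example_inputs,), "quantized_model.onnx")
```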
Maybe this could be relevant: ...
PyTorch 2.2 has partially fixed this issue: pytorch/pytorch#111097. The problem in PyTorch <2.2 seems to be that constants are not accounted for in the model size computation. cc @costigt-dev
From my investigations, there doesn't appear to be any straightforward way to work around this issue in PyTorch 2.1 or below.
When trying to export large models, we are currently forced to export a QDQ pattern for weights instead of simply exporting integer weights -> DQ.
The error seems to be caused by the fact that adding a node with integer weights to the graph confuses the model size calculation during torch export, which then triggers the >2GB error.
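For context, a small inspection helper (hypothetical, not part of this issue) can distinguish the two weight-storage patterns in an exported ONNX file by looking at initializer dtypes: float initializers feeding Q/DQ nodes for the QDQ pattern, versus int8/uint8 initializers feeding DequantizeLinear for the integer weights -> DQ pattern.

```python
# Hypothetical helper, for illustration only: summarizes how weights are stored
# in an exported ONNX graph. Float initializers suggest the QDQ-for-weights
# pattern; int8/uint8 initializers suggest integer weights -> DQ.
import onnx
from onnx import TensorProto

def summarize_weight_storage(path: str) -> dict:
    model = onnx.load(path, load_external_data=False)
    counts = {"float_initializers": 0, "int_initializers": 0}
    for init in model.graph.initializer:
        if init.data_type == TensorProto.FLOAT:
            counts["float_initializers"] += 1
        elif init.data_type in (TensorProto.INT8, TensorProto.UINT8):
            counts["int_initializers"] += 1
    return counts

print(summarize_weight_storage("quantized_model.onnx"))
```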
To reproduce, using the optimum-amd flow (with onnx==1.15.0, torch==2.2.0, brevitas==0.10.2, optimum==1.17.1, optimum-amd from main):
Thanks @fxmarty