-
Notifications
You must be signed in to change notification settings - Fork 518
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feature/quantizer factory refactoring #1963
base: master
Are you sure you want to change the base?
Conversation
… quantizatino params are used
|
||
# TODO: Should this really be here? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not?
We can do PTQ/QAT on multiple GPUs (I specifically remember I wrote support for it in the past...)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okie, removing todo then)
@@ -0,0 +1,34 @@ | |||
quantizer: | |||
TRTPTQQuantizer: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand the logic of decoupling the exported from the quantizer...
I feel this is puts more risk for errors (like having the TRT quantizer with int8 and onnx exporter, which migt mislead users to try and run that onnx file on onnruntime(.
There is some benefit for debugging though, I guess...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree. I think the exporter and quantizer should be part of the same object
output_path: ${architecture}_trt_ptq_int8.onnx | ||
|
||
# Model-specific parameters | ||
export_params: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to properly document all the export params with their corresponding exporters.
I would feel a bit lost if I wanted lets say, try and build a TRT engine on a segmentation model for instance.
src/super_gradients/recipes/quantization_params/default_tensorrt_qat.yaml
Outdated
Show resolved
Hide resolved
detection_max_predictions_per_image: 128 | ||
detection_num_pre_nms_predictions: 1000 | ||
detection_predictions_format: flat | ||
detection_postprocessing_use_tensorrt_nms: False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find it pretty confusing that we have an Exporter object, but model specific export_params....
Especially given the fact that there's detection_postprocessing_use_tensorrt_nms argument here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is nothing new btw, ExportParams are not introduced in this PR, it's just I've added explicitly all the fields that exists in ExportParams structure to help user start editing these parameters.
I can duplicate docstrings on each setting, but it's going quite long YAML (see ExportParams)
…ntization on instantiation of TRT Quantizers
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very impressive.
Not sure I understand the logic.
Why do we convert to onnx and then apply the specific quantazier and exporter?
""" | ||
|
||
@abc.abstractmethod | ||
def export_from_onnx(self, source_onnx: str, output_file: str) -> str: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think I understand the logic of this.
We said that SG will not export the TRT engine, or OpenVino engine. We export an onnx file (even if that file is framework specific)
|
||
model = ptq( | ||
quantized_model = tensorrt_ptq( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand. This is only for TRT, what if I want to convert to something else?
@@ -0,0 +1,34 @@ | |||
quantizer: | |||
TRTPTQQuantizer: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree. I think the exporter and quantizer should be part of the same object
List of breaking changes:
QATRecipeModificationCallback is not required anymore. modify_params_for_qat method is now a part of TensorRT Quantizer class
Fallback mechanism:
Class can be left "as is", with warning message indicating it is deprecated. Implementation is no-op (making no changes to config file)
Quantization params config section now has different structure
Fallback mechanism:
Probably can detect whether new/old config is given and fallback to TRT quantizer in the latter case
Trainer.ptq / Trainer.qat methods.
We can make them work as before, but this would limit us to using only a TRT quantizer.