
Releases: Xilinx/brevitas

Release v0.11.0

10 Oct 12:31

Breaking Changes

  • ONNX QOp export has been removed (#917)
  • QuantTensor can no longer have empty metadata fields (e.g., scale, bit-width) (#819)
  • Bias quantization now requires an explicit bit-width (#839)
  • QuantLayers no longer expose quant_metadata directly; this is delegated to the proxies (#883)
  • QuantDropout has been removed (#861)
  • QuantMaxPool has been removed (#858)

Highlights

  • Support for OCP/FNUZ FP8 quantization (see the sketch after this list)

    • Compatibility with QAT/PTQ, including all currently implemented PTQ algorithms (GPTQ, LearnedRound, GPFQ, etc.)
    • Fully customizable minifloat configuration (i.e., mantissa/exponent bit-width, exponent bias, etc.)
    • Support for ONNX QDQ export
  • Support for OCP MX quantization

    • Compatibility with QAT/PTQ, including all currently implemented PTQ algorithms (GPTQ, LearnedRound, GPFQ, etc.)
    • Fully customizable minifloat configuration (i.e., mantissa/exponent bit-width, exponent bias, group size, etc.)
  • New QuantTensor variants:

    • FloatQuantTensor: supports OCP FP formats and general minifloat quantization
    • GroupwiseQuantTensor: supports OCP MX formats and general groupwise int/minifloat quantization
  • Support for channel splitting

  • Support for HQO optimization of the zero point

  • Support for HQO optimization of the scale (prototype)

  • Improved SDXL entrypoint under brevitas_examples

  • Improved LLM entrypoint under brevitas_examples

    • Compatibility with accelerate
  • Prototype support for torch.compile:

    • See PR #1006 for an example of how to use it; minimal sketches of FP8 quantization and of torch.compile follow this list
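
A minimal sketch of what the new FP8 support looks like at the layer level. The quantizer name and import path (Fp8e4m3OCPWeightPerTensorFloat from brevitas.quant.experimental.float_quant_ocp) are assumptions based on this release's feature list; verify them against your installed version.

```python
import torch

from brevitas.nn import QuantLinear
# Assumed import path for the OCP FP8 E4M3 weight quantizer.
from brevitas.quant.experimental.float_quant_ocp import Fp8e4m3OCPWeightPerTensorFloat

# Linear layer whose weights are fake-quantized to OCP FP8 (E4M3).
layer = QuantLinear(
    in_features=64,
    out_features=32,
    bias=True,
    weight_quant=Fp8e4m3OCPWeightPerTensorFloat)

y = layer(torch.randn(8, 64))  # forward pass with FP8-quantized weights
```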
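
And a minimal sketch of the torch.compile prototype, using a default int8-quantized layer; PR #1006 documents the supported configuration and any flags that may be required.

```python
import torch

from brevitas.nn import QuantConv2d

# QuantConv2d defaults to int8 weight quantization.
model = QuantConv2d(3, 16, kernel_size=3)
model.eval()

# Prototype: compile the fake-quantized model with torch.compile.
compiled = torch.compile(model)
out = compiled(torch.randn(1, 3, 32, 32))
```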

What's Changed

For a more comprehensive list of changes and fixes, see the full changelog on GitHub.


Release v0.10.3

23 Jul 13:48

What's Changed

  • Backport: Fix (export/qonnx): Fixed symbolic kwargs order. (#988) by @nickfraser in #992
  • Pinned the numpy and onnx versions and set a maximum setuptools version

Full Changelog: v0.10.2...v0.10.3

Release v0.10.2

19 Feb 16:37

What's Changed

Full Changelog: v0.10.1...v0.10.2

Release v0.10.1

15 Feb 11:50

Highlights

  • Support for A2Q+ (paper)
  • A2Q+ examples with CIFAR10 and super resolution
  • Support for concatenation equalization for weights and activations
  • Support for GPFQ + A2Q L1-norm bound
  • Option to explicitly export the Q node for weights in QCDQ export
  • Support for float16 and bfloat16 in QCDQ export (see the sketch after this list)
  • Support for dynamic activation quantization in ONNX QDQ export
  • Support for channel splitting (paper)
  • (Beta) Better compatibility with Hugging Face accelerate and optimum
  • (Beta) Improved support and testing for minifloat quantization
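
As a pointer for the QCDQ items above, here is a minimal export sketch using brevitas.export.export_onnx_qcdq; the exact keyword arguments (e.g., export_path) should be checked against your installed version.

```python
import torch

from brevitas.export import export_onnx_qcdq
from brevitas.nn import QuantLinear

# Default int8 weight quantization; QCDQ export emits Quant/Clip/DeQuant nodes.
model = QuantLinear(64, 32, bias=True)
model.eval()

export_onnx_qcdq(model, args=torch.randn(1, 64), export_path='quant_linear_qcdq.onnx')
```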

What's Changed

Full Changelog: v0.10.0...v0.10.1

A2Q+ CIFAR10 model release

12 Feb 18:17
c78f974
Pre-release

This release contains training code and pre-trained weights to demonstrate accumulator-aware quantization (A2Q) on an image classification task. Code is also provided to demonstrate Euclidean projection-based weight initialization (EP-init) as proposed in our paper "A2Q+: Improving Accumulator-Aware Weight Quantization".

Find the associated docs at https://github.com/Xilinx/brevitas/tree/a2q_cifar10_r1/src/brevitas_examples/imagenet_classification/a2q.

A2Q+ model release

30 Jan 19:00
17fb49e
Pre-release

A2Q+ Super Resolution Experiments with Brevitas

This release contains training code and pre-trained weights to demonstrate accumulator-aware quantization (A2Q+) as proposed in our paper "A2Q+: Improving Accumulator-Aware Weight Quantization" on a super resolution task.

Find the associated docs at https://github.com/Xilinx/brevitas/tree/super_res_r2/src/brevitas_examples/super_resolution.

Release v0.10.0

08 Dec 16:36

Highlights

  • Support for PyTorch up to version 2.1.
  • Support for the GPTQ PTQ algorithm (see the sketch after this list).
  • Support for the GPFQ PTQ algorithm.
  • Support for the SmoothQuant / activation equalization PTQ algorithm.
  • Support for MSE-based scale and zero-point for weights and activations.
  • Support for row-wise scaling at the input of QuantLinear.
  • Support for quantization of a slice of a weight tensor.
  • End-to-end support for learned rounding in ImageNet PTQ.
  • End-to-end example training scripts for A2Q (low-precision accumulation) on super resolution.
  • Experimental support for minifloats (eXmY quantization).
  • Experimental LLM PTQ flow with support for weight-only and weight+activation quantization, together with GPTQ, AWQ and SmoothQuant.
  • Experimental Stable Diffusion PTQ flow with support for weight-only quantization.
  • Deprecated the FINN ONNX export flow.
  • Updated the custom value_trace FX tracer to the latest FX.
  • New custom variant of the make_fx tracer with support for custom torch.library ops through the @Wrap annotation.
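
A minimal sketch of the GPTQ flow, assuming the gptq_mode context manager from brevitas.graph.gptq and the looping pattern used in the brevitas_examples PTQ scripts; names and attributes should be double-checked there.

```python
import torch

from brevitas.graph.gptq import gptq_mode
from brevitas.nn import QuantConv2d

# Toy quantized model and synthetic calibration data, for illustration only.
model = torch.nn.Sequential(QuantConv2d(3, 8, kernel_size=3), torch.nn.ReLU())
calib_data = [torch.randn(4, 3, 32, 32) for _ in range(8)]

model.eval()
with torch.no_grad():
    with gptq_mode(model) as gptq:
        # One pass over the calibration set per layer being optimized.
        for _ in range(gptq.num_layers):
            for batch in calib_data:
                gptq.model(batch)
            gptq.update()
```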

What's Changed

For the complete list of changes, see the full changelog on GitHub.

A2Q model release

20 Sep 16:07
acf1f5d
Pre-release

Integer-Quantized Super Resolution Experiments with Brevitas

This release contains scripts demonstrating how to train integer-quantized super resolution models using Brevitas.
Code is also provided to demonstrate accumulator-aware quantization (A2Q) as proposed in our ICCV 2023 paper "A2Q: Accumulator-Aware Quantization with Guaranteed Overflow Avoidance".

Find the associated docs at https://github.com/Xilinx/brevitas/tree/super_res_r1/src/brevitas_examples/super_resolution.
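
For flavor, here is a minimal sketch of an accumulator-aware quantized conv layer. The quantizer name (Int8AccumulatorAwareWeightQuant), the weight_accumulator_bit_width keyword, and the need for a quantized input are taken from the A2Q example code as I understand it; treat all of them as assumptions and follow the linked docs for the exact recipe.

```python
import torch

from brevitas.nn import QuantConv2d
from brevitas.quant import Int8AccumulatorAwareWeightQuant, Int8ActPerTensorFloat

# A2Q constrains the weight norms so that accumulation into a fixed-width
# accumulator (here 16 bits) cannot overflow; it needs a known input bit-width.
conv = QuantConv2d(
    3, 16, kernel_size=3,
    input_quant=Int8ActPerTensorFloat,
    weight_quant=Int8AccumulatorAwareWeightQuant,
    weight_accumulator_bit_width=16)

out = conv(torch.randn(1, 3, 32, 32))
```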

Release v0.9.1

28 Apr 16:57

What's Changed

Full Changelog: v0.9.0...v0.9.1

Release v0.9.0

21 Apr 17:50

Highlights

Overview of changes

Graph quantization

Quantized layers

  • Initial support for QuantMultiheadAttention (#568); see the sketch below
  • Breaking change: renamed Quant(Adaptive)AvgPool to Trunc(Adaptive)AvgPool by @volcacius in #562
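
A minimal sketch of the new layer, assuming its constructor and forward mirror torch.nn.MultiheadAttention (keyword names included); check the class definition before relying on it.

```python
import torch

from brevitas.nn import QuantMultiheadAttention

# Drop-in, quantized counterpart of nn.MultiheadAttention (assumed interface).
mha = QuantMultiheadAttention(embed_dim=64, num_heads=4, batch_first=True)

x = torch.randn(2, 10, 64)  # (batch, sequence, embedding)
out, attn_weights = mha(x, x, x)  # self-attention with quantized projections
```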

Quantizers

QuantTensor

PTQ

Export

CI, linting

FX

Examples

Full Changelog: v0.8.0...v0.9.0