Brevitas: neural network quantization in PyTorch
Updated Dec 20, 2024 - Python
A more readable and flexible YOLOv5 with additional backbones (GCN, ResNet, ShuffleNet, MobileNet, EfficientNet, HRNet, Swin Transformer, etc.), modules (CBAM, DCN, and so on), and TensorRT support
Model Compression Toolkit (MCT) is an open source project for optimizing neural network models to run on efficient, constrained hardware. This project provides researchers, developers, and engineers with advanced quantization and compression tools for deploying state-of-the-art neural networks.
mi-optimize is a versatile tool designed for the quantization and evaluation of large language models (LLMs). The library's seamless integration of various quantization methods and evaluation techniques empowers users to customize their approaches according to specific requirements and constraints, providing a high level of flexibility.
[ICML 2024] Outlier-Efficient Hopfield Layers for Large Transformer-Based Models
Generating TensorRT models from ONNX
Quantization examples for PTQ and QAT
Inference with structured sparsity and quantization
Post post-training-quantization (PTQ) method for improving LLMs. Unofficial implementation of https://arxiv.org/abs/2309.02784
Build an AI model to classify beverages for blind individuals
EfficientNetV2 (EfficientNetV2-B2) with INT8 and FP32 quantization (QAT and PTQ) on the CK+ dataset: fine-tuning, augmentation, handling class imbalance, etc.
Useful sample code for TensorRT models built from ONNX
Quantization of Models: Post-Training Quantization (PTQ) and Quantization-Aware Training (QAT)