[ICML 2024] SqueezeLLM: Dense-and-Sparse Quantization
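For orientation, a minimal sketch of the dense-and-sparse idea this entry refers to: a handful of outlier weights are kept in full precision as a sparse matrix, while the remaining dense part is quantized to low bit-width. SqueezeLLM itself uses sensitivity-based non-uniform (k-means) quantization; the symmetric uniform 4-bit grid below is a simplifying assumption for illustration only.

```python
import numpy as np

def dense_and_sparse_quantize(W, outlier_frac=0.005, bits=4):
    """Split W into a low-bit dense part plus full-precision sparse outliers."""
    W = W.astype(np.float32)
    # Treat the k largest-magnitude weights as outliers.
    k = max(1, int(outlier_frac * W.size))
    thresh = np.partition(np.abs(W).ravel(), -k)[-k]
    outlier_mask = np.abs(W) >= thresh

    sparse = np.where(outlier_mask, W, 0.0)   # kept in full precision
    dense = np.where(outlier_mask, 0.0, W)    # quantized below

    # Symmetric uniform quantization of the dense remainder.
    levels = 2 ** (bits - 1) - 1
    scale = np.abs(dense).max() / levels
    scale = scale if scale > 0 else 1.0
    q = np.clip(np.round(dense / scale), -levels, levels).astype(np.int8)
    return q, scale, sparse

def dequantize(q, scale, sparse):
    return q.astype(np.float32) * scale + sparse

W = np.random.randn(256, 256)
q, s, sp = dense_and_sparse_quantize(W)
print("max abs error:", np.abs(W - dequantize(q, s, sp)).max())
```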
[NeurIPS 2024] KVQuant: Towards 10 Million Context Length LLM Inference with KV Cache Quantization
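A rough sketch of the core observation behind KV cache quantization: keys tend to have outlier channels, so they are quantized per-channel, while values are quantized per-token. KVQuant itself additionally uses pre-RoPE key quantization, non-uniform datatypes, and outlier isolation; only the axis choice is reproduced here.

```python
import numpy as np

def quantize_along(x, axis, bits=4):
    """Symmetric uniform quantization with one scale per slice along `axis`."""
    levels = 2 ** (bits - 1) - 1
    scale = np.abs(x).max(axis=axis, keepdims=True) / levels
    scale = np.where(scale == 0, 1.0, scale)
    q = np.clip(np.round(x / scale), -levels, levels).astype(np.int8)
    return q, scale

seq_len, head_dim = 1024, 128
K = np.random.randn(seq_len, head_dim).astype(np.float32)
V = np.random.randn(seq_len, head_dim).astype(np.float32)

qK, sK = quantize_along(K, axis=0)  # per-channel: one scale per key dimension
qV, sV = quantize_along(V, axis=1)  # per-token: one scale per cached token

K_hat = qK * sK
print("key reconstruction RMSE:", np.sqrt(((K - K_hat) ** 2).mean()))
```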
OpenSSA: Small Specialist Agents based on Domain-Aware Neurosymbolic Agent (DANA) architecture for industrial problem-solving
[CVPR 2024] Asymmetric Masked Distillation for Pre-Training Small Foundation Models
This repository features a custom-built decoder-only language model with a total of 37 million parameters 🔥. I train the model to ask questions about a given context.
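The repository does not spell out its architecture here, so as a point of reference, this is an illustrative PyTorch sketch of a decoder-only transformer whose config is guessed purely to land near a ~37M parameter budget (vocab size, depth, and width below are assumptions, not the repo's actual values):

```python
import torch
import torch.nn as nn

class TinyDecoderLM(nn.Module):
    def __init__(self, vocab=16000, d=512, heads=8, layers=8, ctx=512):
        super().__init__()
        self.tok = nn.Embedding(vocab, d)
        self.pos = nn.Embedding(ctx, d)
        block = nn.TransformerEncoderLayer(
            d_model=d, nhead=heads, dim_feedforward=4 * d,
            batch_first=True, norm_first=True)
        self.blocks = nn.TransformerEncoder(block, num_layers=layers)
        self.head = nn.Linear(d, vocab, bias=False)
        self.head.weight = self.tok.weight  # weight tying saves parameters

    def forward(self, ids):
        T = ids.size(1)
        x = self.tok(ids) + self.pos(torch.arange(T, device=ids.device))
        # Causal mask so each position attends only to earlier tokens.
        mask = nn.Transformer.generate_square_subsequent_mask(T).to(ids.device)
        x = self.blocks(x, mask=mask)
        return self.head(x)

model = TinyDecoderLM()
print(sum(p.numel() for p in model.parameters()) / 1e6, "M params")
```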
Overview of self-supervised learning for tiny models, including distillation-based methods (a.k.a. self-supervised distillation) and non-distillation methods.
Code for "On the Surprising Efficacy of Distillation as an Alternative to Pre-Training Small Models"
Help us define the Pareto front of small models for MNIST classification. Frugal AI.
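The Pareto front mentioned above can be computed directly: a (size, accuracy) point is on the front if no other model is both smaller and at least as accurate. The model names and numbers below are made up for illustration.

```python
def pareto_front(models):
    """models: list of (name, num_params, accuracy).
    Smaller and more accurate is better."""
    front = []
    for name, p, acc in models:
        dominated = any(
            p2 <= p and a2 >= acc and (p2, a2) != (p, acc)
            for _, p2, a2 in models
        )
        if not dominated:
            front.append((name, p, acc))
    return sorted(front, key=lambda m: m[1])

candidates = [
    ("logreg", 7_850, 0.92),
    ("cnn-tiny", 30_000, 0.97),
    ("mlp-small", 50_000, 0.96),  # dominated by cnn-tiny
    ("cnn-big", 500_000, 0.99),
]
print(pareto_front(candidates))
```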
Testing the Phi-3-Vision model running locally.