Starred repositories
A collection of memory-efficient attention operators implemented in the Triton language.
FlagGems is an operator library for large language models implemented in Triton Language.
Interactively inspect module inputs, outputs, parameters, and gradients.
Building a quick conversation-based search demo with Lepton AI.
Video+code lecture on building nanoGPT from scratch
Graphic notes on Gilbert Strang's "Linear Algebra for Everyone", Chinese edition of "The Art of Linear Algebra"; PRs welcome.
Graphic notes on Gilbert Strang's "Linear Algebra for Everyone"
MiniCPM3-4B: An edge-side LLM that surpasses GPT-3.5-Turbo.
AirLLM: 70B inference with a single 4GB GPU
A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 🍓 and reasoning techniques.
Transformer Explained Visually: Learn How LLM Transformer Models Work with Interactive Visualization
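Hung-yi Lee Deep Learning Tutorial (recommended by Professor Hung-yi Lee 👍, also known as the "Apple Book" 🍎). PDF download: https://github.com/datawhalechina/leedl-tutorial/releases
Train a 1B LLM on 1T tokens from scratch as an individual
Build a large language model from scratch with only basic Python; build GLM4, Llama3, and RWKV6 step by step from scratch to gain a deep understanding of how large models work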
MiniCPM-V 2.6: A GPT-4V Level MLLM for Single Image, Multi Image and Video on Your Phone
GPGPU processor supporting the RISC-V V extension, developed with Chisel HDL
Implement a ChatGPT-like LLM in PyTorch from scratch, step by step
Llama 3 implementation, one matrix multiplication at a time
The most powerful and modular diffusion model GUI, API, and backend with a graph/nodes interface.
An awesome, curated list of the best LLMOps tools for developers
Robust Speech Recognition via Large-Scale Weak Supervision
The official Python client for the Hugging Face Hub.
ModelScope: bring the notion of Model-as-a-Service to life.
The official code for "ToolAlpaca: Generalized Tool Learning for Language Models with 3000 Simulated Cases"
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.