sgl-project/sgl-learning-materials

Materials for learning SGLang

Blog

LMSYS Org

[2024-12-04] SGLang v0.4: Zero-Overhead Batch Scheduler, Cache-Aware Load Balancer, Faster Structured Outputs

[2024-09-04] SGLang v0.3 Release: 7x Faster DeepSeek MLA, 1.5x Faster torch.compile, Multi-Image/Video LLaVA-OneVision

[2024-07-25] Achieving Faster Open-Source Llama3 Serving with SGLang Runtime (vs. TensorRT-LLM, vLLM)

[2024-02-05] Fast JSON Decoding for Local LLMs with Compressed Finite State Machine

[2024-01-17] Fast and Expressive LLM Inference with RadixAttention and SGLang

AMD

[2024-11-13] SGLang: Fast Serving Framework for Large Language and Vision-Language Models on AMD GPUs

Slides

CAMEL-AI Hackathon: Mastering Multi-Agent Systems

[2024-12-21] SGLang v0.4 Optimization

GPU MODE

[2024-11-10] SGLang Performance Optimization

The first LMSYS online meetup: Efficient LLM Deployment and Serving

[2024-10-16] SGLang Overview & CPU Overhead Hiding

[2024-10-16] Faster Constrained Decoding

[2024-10-16] SGLang DeepSeek MLA

[2024-10-16] Universal LLM deployment and low-latency serving in MLC LLM

[2024-10-16] XGrammar: Flexible And Efficient Structured Generation Engine for Large Language Models

[2024-10-16] Review of the first LMSYS online meetup: Efficient LLM Deployment and Serving

AMD Advancing AI 2024

[2024-10-10] Efficient LLM Inference with SGLang

SGLang Biweekly Meeting

[2024-11-30] Update Weights From Distributed

[2024-11-16] SGLang Router and Side-Channel KV Cache Attack

[2024-11-02] Quantization on AMD

[2024-10-05] SGLang Double Sparsity

[2024-09-21] SGLang DeepSeek MLA

Other

SGLang v0.2: Faster Interface and Runtime for LLM Inference

Videos

Follow our YouTube channel for new recordings.

GPU MODE

[2024-11-10] SGLang Performance Optimization

The first LMSYS online meetup

[2024-10-16] The First SGLang Online Meetup

AMD Advancing AI 2024

[2024-10-10] Efficient LLM Inference with SGLang

SGLang Biweekly Meeting

[2024-12-14] SGLang Developer Sync 20241214

[2024-11-30] SGLang Developer Sync 20241130

[2024-11-16] SGLang Developer Sync 20241116

[2024-11-03] SGLang Developer Sync 20241103

[2024-10-19] SGLang Developer Sync 20241019

[2024-10-05] SGLang Developer Sync 20241005

[2024-09-21] SGLang Developer Sync 20240921

Paper

[NeurIPS 24] SGLang: Efficient Execution of Structured Language Model Programs

Documentation

SGLang Documentation
