230516 SpecInfer.md

https://arxiv.org/abs/2305.09781

SpecInfer: Accelerating Generative LLM Serving with Speculative Inference and Token Tree Verification (Xupeng Miao, Gabriele Oliaro, Zhihao Zhang, Xinhao Cheng, Zeyu Wang, Rae Ying Yee Wong, Zhuoming Chen, Daiyaan Arfeen, Reyna Abhyankar, Zhihao Jia)