https://arxiv.org/abs/2305.09781

SpecInfer: Accelerating Generative LLM Serving with Speculative Inference and Token Tree Verification (Xupeng Miao, Gabriele Oliaro, Zhihao Zhang, Xinhao Cheng, Zeyu Wang, Rae Ying Yee Wong, Zhuoming Chen, Daiyaan Arfeen, Reyna Abhyankar, Zhihao Jia)

230516 SpecInfer.md
