Efficient Large Scale Language Modeling with Mixtures of Experts (Mikel Artetxe, Shruti Bhosale, Naman Goyal, Todor Mihaylov, Myle Ott, Sam Shleifer, Xi Victoria Lin, Jingfei Du, Srinivasan Iyer, Ramakanth Pasunuru, Giri Anantharaman, Xian Li, Shuohui Chen, Halil Akin, Mandeep Baines, Louis Martin, Xing Zhou, Punit Singh Koura, Brian O'Horo, Jeff Wang, Luke Zettlemoyer, Mona Diab, Zornitsa Kozareva, Ves Stoyanov)

The next LLM should be an MoE model.

#lm #mixture_of_experts

211220 Efficient Large Scale Language Modeling with Mixtures of Experts.md
