https://arxiv.org/abs/2208.14580
Efficient Sparsely Activated Transformers (Salar Latifi, Saurav Muralidharan, Michael Garland)
Latency-targeted transformer architecture search whose search space includes MoE layers.
#nas #moe