200424 Lite Transformer with Long-Short Range Attention.md

https://arxiv.org/abs/2004.11886

Lite Transformer with Long-Short Range Attention (Zhanghao Wu, Zhijian Liu, Ji Lin, Yujun Lin, Song Han)

Slimming down the Transformer. The paper removes the bottleneck and adds a conv layer. Transformer-slimming work seems to be converging on similar approaches.
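A minimal sketch of the attention-plus-conv idea, in pure Python so it runs without dependencies. It assumes the channels are split in half, with one half going through self-attention (long range) and the other through a depthwise 1D convolution (short range), and the two results concatenated. The function names (`lsra`, `self_attention`, `depthwise_conv1d`), the identity Q/K/V projections, the single head, and the fixed shared kernel are all simplifications for illustration, not the paper's actual implementation.

```python
import math


def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x):
    # Single-head scaled dot-product self-attention.
    # Assumption: identity Q/K/V projections (no learned weights).
    # x is a list of n token vectors, each of length d.
    d = len(x[0])
    out = []
    for q in x:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in x]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, x)) for j in range(d)])
    return out


def depthwise_conv1d(x, kernel):
    # Convolution along the sequence axis, applied to each channel
    # independently, with zero padding so the length is preserved.
    # Assumption: one fixed kernel shared by all channels.
    n, d = len(x), len(x[0])
    half = len(kernel) // 2
    out = []
    for t in range(n):
        row = []
        for j in range(d):
            acc = 0.0
            for offset, w in enumerate(kernel):
                s = t + offset - half
                if 0 <= s < n:
                    acc += w * x[s][j]
            row.append(acc)
        out.append(row)
    return out


def lsra(x, kernel):
    # Hypothetical two-branch block: split channels, attend on one half,
    # convolve the other half, then concatenate per token.
    split = len(x[0]) // 2
    global_branch = self_attention([row[:split] for row in x])
    local_branch = depthwise_conv1d([row[split:] for row in x], kernel)
    return [g + l for g, l in zip(global_branch, local_branch)]


tokens = [[1.0, 0.0, 2.0, 1.0],
          [0.0, 1.0, 0.0, 3.0],
          [1.0, 1.0, 4.0, 5.0]]
mixed = lsra(tokens, kernel=[0.25, 0.5, 0.25])
print(len(mixed), len(mixed[0]))
```

Each branch sees only half of the channels, so the attention branch costs less than full-width attention while the conv branch handles local context.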

#transformer #lightweight