210428 Twins.md

File metadata and controls

7 lines (4 loc) · 424 Bytes

https://arxiv.org/abs/2104.13840

Twins: Revisiting Spatial Attention Design in Vision Transformers (Xiangxiang Chu, Zhi Tian, Yuqing Wang, Bo Zhang, Haibing Ren, Xiaolin Wei, Huaxia Xia, Chunhua Shen)

Local attention + window-level global attention + a positional encoding generator. The positional encoding generator is again quite an interesting approach.
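A minimal numpy sketch of the positional-encoding-generator idea (conditional positional encoding via a depthwise 3x3 convolution over the 2D token grid, added residually). Function and variable names here are my own; the paper implements this as a depthwise conv layer inside the transformer, not hand-rolled loops.

```python
import numpy as np

def peg(tokens, H, W, weights):
    """Sketch of a positional encoding generator (PEG).

    tokens:  (H*W, C) flattened token sequence
    weights: (C, 3, 3) per-channel depthwise 3x3 kernel
    Returns tokens + depthwise_conv(tokens) reshaped back to (H*W, C).
    """
    C = tokens.shape[1]
    x = tokens.reshape(H, W, C)                     # restore the 2D token grid
    padded = np.pad(x, ((1, 1), (1, 1), (0, 0)))    # zero padding; zeros at the
                                                    # border are what make the
                                                    # encoding position-aware
    out = np.zeros_like(x)
    for i in range(3):                              # depthwise conv: each channel
        for j in range(3):                          # convolved with its own kernel
            out += padded[i:i + H, j:j + W, :] * weights[:, i, j]
    return (x + out).reshape(H * W, C)              # residual connection
```

Because the encoding is generated from the tokens themselves, it adapts to any input resolution, unlike fixed learned positional embeddings.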

#vision_transformer #local_attention #positional_encoding