FNet: Mixing Tokens with Fourier Transforms (James Lee-Thorp, Joshua Ainslie, Ilya Eckstein, Santiago Ontanon)

Replaces self-attention with a Fourier transform, moving to frequency space to mix tokens. As the paper points out, this can end up having an effect similar to inserting convolutions. [[210510 Are Pre-trained Convolutions Better than Pre-trained Transformers]] Performance does fall short of self-attention, though.
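
A minimal NumPy sketch of the token-mixing step, assuming the mixing is a 2D DFT over the sequence and hidden dimensions with only the real part kept. The function names `fourier_mix` and `fourier_mix_via_matrices` are hypothetical, and the residual connection, layer norm, and feed-forward sublayer of a full block are left out.

```python
import numpy as np


def fourier_mix(x: np.ndarray) -> np.ndarray:
    """Parameter-free token mixing for x of shape (seq_len, hidden_dim).

    Takes a DFT along the hidden axis and along the sequence axis
    (together a 2D DFT) and keeps only the real part.
    """
    return np.fft.fft2(x).real


def fourier_mix_via_matrices(x: np.ndarray) -> np.ndarray:
    """Same mixing written as two dense DFT-matrix multiplications."""
    seq_len, hidden_dim = x.shape
    w_seq = np.fft.fft(np.eye(seq_len))     # (seq_len, seq_len) DFT matrix
    w_hid = np.fft.fft(np.eye(hidden_dim))  # (hidden_dim, hidden_dim) DFT matrix
    return (w_seq @ x @ w_hid).real


rng = np.random.default_rng(0)
x = rng.standard_normal((8, 4))  # 8 tokens, hidden size 4 (illustrative sizes)
y = fourier_mix(x)

# Every output position depends on every input token: a single nonzero
# entry at position (0, 0) spreads to all positions after mixing.
delta = np.zeros((8, 4))
delta[0, 0] = 1.0
spread = fourier_mix(delta)
```

The matrix form makes the conv analogy visible: the mixing is a fixed linear map with no learned weights, and by the convolution theorem a pointwise product in this frequency space corresponds to a circular convolution over tokens.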

Long Range Arena is also about due for a benchmark refresh; an updated version should come out soon.

#efficient_attention #fourier #transformer
