211018 NormFormer.md

https://arxiv.org/abs/2110.09456

NormFormer: Improved Transformer Pretraining with Extra Normalization (Sam Shleifer, Jason Weston, Myle Ott)

A Transformer with extra normalization layers and scaling factors inserted. They've already tested the ReLU^2 activation from [[210917 Primer]].
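
A rough sketch of what "extra normalization and scaling factors" could look like, in plain Python. The placements here are my reading of the paper, not something this note spells out: a learned scalar per attention head, and an additional LayerNorm after the first feed-forward layer. All function names are hypothetical, and the learned gain/bias of LayerNorm is omitted for brevity.

```python
import math

def layer_norm(x, eps=1e-5):
    """Normalize one vector to zero mean and unit variance (no learned gain/bias)."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]

def relu_squared(x):
    """Squared ReLU, the activation the note attributes to Primer."""
    return [max(v, 0.0) ** 2 for v in x]

def matvec(w, x):
    """Multiply matrix w (a list of rows) by vector x."""
    return [sum(wij * xj for wij, xj in zip(row, x)) for row in w]

def scale_heads(head_outputs, gammas):
    """Assumed placement: scale each attention head's output by its own scalar."""
    return [[g * v for v in head] for head, g in zip(head_outputs, gammas)]

def ffn_block(x, w1, w2, activation=relu_squared):
    """Pre-LN feed-forward block with one extra LayerNorm after the activation."""
    h = activation(matvec(w1, layer_norm(x)))  # usual pre-LN + first projection
    h = layer_norm(h)                          # assumed placement of the extra norm
    out = matvec(w2, h)                        # second projection
    return [xi + oi for xi, oi in zip(x, out)] # residual connection

# Toy usage: model dim 3, hidden dim 4, arbitrary weights.
x = [1.0, 2.0, 3.0]
w1 = [[0.1, 0.2, 0.3], [0.0, -0.1, 0.2], [0.3, 0.1, 0.0], [-0.2, 0.4, 0.1]]
w2 = [[0.1, 0.0, 0.2, -0.1], [0.2, 0.1, 0.0, 0.3], [0.0, -0.2, 0.1, 0.1]]
y = ffn_block(x, w1, w2)
heads = scale_heads([[1.0, 2.0], [3.0, 4.0]], [2.0, 0.5])
```

Swapping `activation` for a plain ReLU or GELU gives a baseline to compare the ReLU^2 variant against.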

#transformer