https://arxiv.org/abs/2004.11207

Self-Attention Attribution: Interpreting Information Interactions Inside Transformer (Yaru Hao, Li Dong, Furu Wei, Ke Xu)

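A method for extracting the hierarchical structure in which tokens combine inside a Transformer to reach a prediction. The authors also use it to attempt adversarial attacks. Interesting results.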
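
As a rough illustration of the attribution step: the paper scores each attention entry with an integrated-gradients style quantity, roughly `A * ∫ ∂F(αA)/∂A dα` for α from 0 to 1, approximated by a Riemann sum. The sketch below applies that idea to a toy one-head "model". Everything in it is an assumption made for illustration and not the authors' code: the helper names, the scalar `values`, the `tanh` readout on token 0, and the finite-difference gradient standing in for backpropagation.

```python
import math

def model_output(attn, values):
    """Toy stand-in for F(A): a score read off token 0 after one attention head."""
    pooled = sum(a * v for a, v in zip(attn[0], values))
    return math.tanh(pooled)

def numerical_grad(f, attn, eps=1e-5):
    """Central-difference gradient of f with respect to every attention entry."""
    n_rows, n_cols = len(attn), len(attn[0])
    grad = [[0.0] * n_cols for _ in range(n_rows)]
    for i in range(n_rows):
        for j in range(n_cols):
            plus = [row[:] for row in attn]
            minus = [row[:] for row in attn]
            plus[i][j] += eps
            minus[i][j] -= eps
            grad[i][j] = (f(plus) - f(minus)) / (2 * eps)
    return grad

def attention_attribution(f, attn, steps=50):
    """Riemann-sum approximation of A * integral over alpha of dF(alpha*A)/dA."""
    n_rows, n_cols = len(attn), len(attn[0])
    total = [[0.0] * n_cols for _ in range(n_rows)]
    for k in range(1, steps + 1):
        alpha = k / steps
        scaled = [[alpha * a for a in row] for row in attn]
        grad = numerical_grad(f, scaled)
        for i in range(n_rows):
            for j in range(n_cols):
                total[i][j] += grad[i][j]
    return [[attn[i][j] * total[i][j] / steps for j in range(n_cols)]
            for i in range(n_rows)]

# Hypothetical 3-token example: row i holds token i's attention weights.
attn = [[0.7, 0.2, 0.1],
        [0.3, 0.4, 0.3],
        [0.1, 0.1, 0.8]]
values = [1.5, -0.5, 0.2]  # assumed scalar "value" per token

attr = attention_attribution(lambda a: model_output(a, values), attn)
for row in attr:
    print([round(x, 4) for x in row])
```

In this toy setup only row 0 feeds the output, so the other rows get zero attribution, and the entry pointing at the negative-valued token gets a negative score. The entries also sum approximately to `F(A) - F(0)`, the usual integrated-gradients completeness property. In the paper, high-attribution connections of this kind are what get assembled into the tree of token interactions.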

#bert #attention