https://arxiv.org/abs/2204.06644

METRO: Efficient Denoising Pretraining of Large Scale Autoencoding Language Models with Model Generated Signals (Payal Bajaj, Chenyan Xiong, Guolin Ke, Xiaodong Liu, Di He, Saurabh Tiwary, Tie-Yan Liu, Paul Bennett, Xia Song, Jianfeng Gao)

A scaled-up ELECTRA model is out. It takes ELECTRA and adds a smaller generator, relative positional encoding, a larger vocabulary, truncation so that sequences do not cross document boundaries, token prediction, and so on.
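
To make the document-boundary point concrete, here is a minimal sketch of one way to build training sequences that never span two documents. The function name `chunk_within_documents`, the `max_len` parameter, and the toy token ids are all hypothetical; the note above does not say how the paper implements this, so treat it as an illustration of the idea only.

```python
from typing import List


def chunk_within_documents(docs: List[List[int]], max_len: int) -> List[List[int]]:
    """Split each tokenized document into chunks of at most max_len tokens.

    Chunks are cut per document, so no chunk ever mixes tokens from two
    different documents. (Hypothetical helper, not the paper's code.)
    """
    if max_len <= 0:
        raise ValueError("max_len must be positive")
    chunks: List[List[int]] = []
    for doc in docs:
        # Step through one document at a time; the final chunk may be shorter.
        for start in range(0, len(doc), max_len):
            chunks.append(doc[start:start + max_len])
    return chunks


# Toy example: three "documents" of token ids.
docs = [[1, 2, 3, 4, 5], [6, 7], [8, 9, 10]]
chunks = chunk_within_documents(docs, max_len=4)
print(chunks)
```

The trade-off is that short trailing chunks need padding, whereas packing documents back to back wastes no positions but lets a sequence straddle unrelated text.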

#lm