230501 What Do Self-Supervised Vision Transformers Learn.md

https://arxiv.org/abs/2305.00729

What Do Self-Supervised Vision Transformers Learn? (Namuk Park, Wonjae Kim, Byeongho Heo, Taekyung Kim, Sangdoo Yun)

Contrastive learning vs. masked image modeling. As the training objectives suggest, contrastive learning tends to focus on extracting global features, while masked image modeling tends to extract comparatively local features. In conclusion... doing both works well.
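As a rough illustration of what "doing both" could look like, here is a minimal pure-Python sketch that sums a contrastive (InfoNCE-style) loss over paired view embeddings with a masked-patch reconstruction loss. The function names, the `temperature` default, and the mixing weight `lam` are assumptions made for this example and are not taken from the paper's code.

```python
import math

def info_nce(z1, z2, temperature=0.2):
    """Contrastive loss: z1[i] should match z2[i] against every other z2[j]."""
    def normalize(v):
        n = math.sqrt(sum(x * x for x in v))
        return [x / n for x in v]

    z1 = [normalize(v) for v in z1]
    z2 = [normalize(v) for v in z2]
    total = 0.0
    for i, a in enumerate(z1):
        # Cosine similarity of view i against all candidates, scaled by temperature.
        logits = [sum(x * y for x, y in zip(a, b)) / temperature for b in z2]
        m = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denom - logits[i]
    return total / len(z1)

def masked_mse(pred, target, mask):
    """MIM-style loss: mean squared error over masked patches only."""
    errs = [
        sum((p - t) ** 2 for p, t in zip(pred[i], target[i])) / len(pred[i])
        for i in range(len(mask)) if mask[i]
    ]
    return sum(errs) / len(errs)

def combined_loss(z1, z2, pred, target, mask, lam=0.5):
    """Hypothetical weighted sum: lam=0 is purely contrastive, lam=1 is purely MIM."""
    return (1 - lam) * info_nce(z1, z2) + lam * masked_mse(pred, target, mask)
```

The contrastive term compares one embedding per image, which matches the note's point about global features, while the reconstruction term is computed per patch, which matches the point about local features.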

#self_supervised #contrastive_learning #mlm