220617 Unified-IO.md

https://arxiv.org/abs/2206.08916

Unified-IO: A Unified Model for Vision, Language, and Multi-Modal Tasks (Jiasen Lu, Christopher Clark, Rowan Zellers, Roozbeh Mottaghi, Aniruddha Kembhavi)

Feeds VQ-VAE tokens and text tokens into a single seq2seq model to tackle vision, text, vision-language, and image generation tasks all at once. With no per-task fine-tuning(!)
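
To make the "image tokens plus text tokens in one sequence" idea concrete, here is a minimal sketch of how discrete image codes and text ids could share one vocabulary for a seq2seq model. It is not the paper's implementation: the vocabulary sizes, the offset scheme, and the helper names (`build_sequence`, `split_sequence`) are all hypothetical choices made for illustration.

```python
from typing import List, Sequence, Tuple

# Hypothetical sizes, chosen only for illustration (not taken from the paper).
TEXT_VOCAB_SIZE = 32_000      # ids 0 .. 31_999 are text tokens
IMAGE_CODEBOOK_SIZE = 8_192   # VQ codebook indices 0 .. 8_191


def build_sequence(text_ids: Sequence[int], image_codes: Sequence[int]) -> List[int]:
    """Merge text ids and VQ image codes into one token sequence.

    Image codes are shifted past the text vocabulary so both modalities
    live in a single shared id space that one seq2seq model can consume.
    """
    for t in text_ids:
        if not 0 <= t < TEXT_VOCAB_SIZE:
            raise ValueError(f"text id out of range: {t}")
    for c in image_codes:
        if not 0 <= c < IMAGE_CODEBOOK_SIZE:
            raise ValueError(f"image code out of range: {c}")
    return list(text_ids) + [TEXT_VOCAB_SIZE + c for c in image_codes]


def split_sequence(tokens: Sequence[int]) -> Tuple[List[int], List[int]]:
    """Invert build_sequence: recover text ids and raw VQ image codes."""
    text_ids = [t for t in tokens if t < TEXT_VOCAB_SIZE]
    image_codes = [t - TEXT_VOCAB_SIZE for t in tokens if t >= TEXT_VOCAB_SIZE]
    return text_ids, image_codes


if __name__ == "__main__":
    seq = build_sequence([5, 17], [0, 3])
    print(seq)
    print(split_sequence(seq))
```

With a shared id space like this, every task reduces to "token sequence in, token sequence out", which is what lets one model cover them without task-specific heads.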

#vision-language #multitask