230511 Musketeer (All for One, and One for All).md

https://arxiv.org/abs/2305.07019

Musketeer (All for One, and One for All): A Generalist Vision-Language Model with Task Explanation Prompts (Zhaoyang Zhang, Yantao Shen, Kunyu Shi, Zhaowei Cai, Jun Fang, Siqi Deng, Hao Yang, Davide Modolo, Zhuowen Tu, Stefano Soatto)

An all-in-one vision model (grounding, captioning, detection, and so on) built on vision-language modeling. The main idea is to feed the model prompts that explain each task, which reduces interference between tasks.
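
As a rough illustration of the task-explanation idea, here is a minimal sketch that turns a structured task description into a text prompt. The field names (`input`, `output`), the wording of each description, and the `build_prompt` helper are all hypothetical and are not taken from the paper or its code.

```python
# Hypothetical task descriptions; the wording is illustrative only
# and does not reproduce the prompts used in the paper.
TASK_EXPLANATIONS = {
    "grounding": {
        "input": "an image and a text describing a region",
        "output": "the bounding box of the described region",
    },
    "captioning": {
        "input": "an image",
        "output": "a text describing the image",
    },
}


def build_prompt(task, instance_text=""):
    """Compose a text prompt that explains the task to a shared model."""
    spec = TASK_EXPLANATIONS[task]
    parts = [
        f"Task: {task}.",
        f"Input: {spec['input']}.",
        f"Output: {spec['output']}.",
    ]
    # Optional per-sample text, e.g. the phrase to ground.
    if instance_text:
        parts.append(f"Instance: {instance_text}")
    return " ".join(parts)


if __name__ == "__main__":
    print(build_prompt("captioning"))
    print(build_prompt("grounding", "the red car"))
```

Because each task gets its own explicit description rather than a bare task name, a single shared model has more signal for telling tasks apart, which is the intuition behind the reduced interference.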

#vision-language #multimodal #multitask