Musketeer (All for One, and One for All): A Generalist Vision-Language Model with Task Explanation Prompts (Zhaoyang Zhang, Yantao Shen, Kunyu Shi, Zhaowei Cai, Jun Fang, Siqi Deng, Hao Yang, Davide Modolo, Zhuowen Tu, Stefano Soatto)

An all-in-one vision model (grounding, captioning, detection, and so on) built on vision-language modeling. The main idea is to feed the model a prompt that explains each task, which reduces interference between tasks.
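
To make the idea concrete, here is a minimal sketch of how a task-explaining prompt might be assembled before being passed to a shared model. Everything in it is an assumption for illustration: the `TaskExplanation` fields, the `build_prompt` helper, and the wording of the task descriptions are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class TaskExplanation:
    """Hypothetical container for a natural-language description of one task."""
    name: str
    description: str
    input_format: str
    output_format: str


def build_prompt(task: TaskExplanation, instance: str) -> str:
    """Prepend the task explanation to the per-sample instruction."""
    return (
        f"Task: {task.name}. {task.description} "
        f"Input: {task.input_format} "
        f"Output: {task.output_format} "
        f"Instruction: {instance}"
    )


# Illustrative task descriptions; the wording is made up for this sketch.
TASKS = {
    "grounding": TaskExplanation(
        name="grounding",
        description="Locate the image region that a phrase refers to.",
        input_format="An image and a text phrase.",
        output_format="Bounding box coordinates.",
    ),
    "captioning": TaskExplanation(
        name="captioning",
        description="Describe the image in one sentence.",
        input_format="An image.",
        output_format="A short text caption.",
    ),
}

grounding_prompt = build_prompt(TASKS["grounding"], "Which region shows 'a red car'?")
captioning_prompt = build_prompt(TASKS["captioning"], "What does the image show?")
```

Because each prompt spells out what the task is and what output is expected, one set of weights can in principle tell the tasks apart from the text alone, rather than from an opaque task ID.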

#vision-language #multimodal #multitask

230511 Musketeer (All for One, and One for All).md
