210401 Towards General Purpose Vision Systems.md

https://arxiv.org/abs/2104.00743

Towards General Purpose Vision Systems (Tanmay Gupta, Amita Kamath, Aniruddha Kembhavi, Derek Hoiem)

A model that takes an image and a text prompt as input and returns both bounding boxes and a text description. Really cool.

#visual_grounding