https://arxiv.org/abs/2302.00923
Multimodal Chain-of-Thought Reasoning in Language Models (Zhuosheng Zhang, Aston Zhang, Mu Li, Hai Zhao, George Karypis, Alex Smola)
It looks like they built a model capable of multimodal chain-of-thought so that it can perform multimodal reasoning.
#multimodal #vision-language