Multimodal data fusion #280

Answered by Sm1les

FreyDeng asked this question in Help Requests (问题求助)

FreyDeng
Oct 15, 2024

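It would be great to have a tutorial on multimodal data fusion: for example, extracting features from text, image, audio, and video data, then fusing those features with various attention mechanisms or Transformer architectures, and finally completing downstream tasks such as classification or regression. Thanks.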
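
The pipeline described in the question (per-modality features, attention-based fusion, then a task head) can be sketched roughly as follows. This is only a minimal illustration in NumPy, not code from the repository or from any tutorial: the random arrays stand in for the outputs of real text and image encoders, the weights are untrained, and every name here (`cross_attention`, `fuse_and_classify`, `params`, the dimensions) is a hypothetical choice made for the example.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cross_attention(query, context, w_q, w_k, w_v):
    """Single-head cross-attention: query tokens attend over context tokens."""
    q = query @ w_q                            # (n_query, d_model)
    k = context @ w_k                          # (n_context, d_model)
    v = context @ w_v                          # (n_context, d_model)
    scores = q @ k.T / np.sqrt(q.shape[-1])    # (n_query, n_context)
    weights = softmax(scores, axis=-1)         # each row sums to 1
    return weights @ v, weights


def fuse_and_classify(text_feats, image_feats, params):
    """Fuse text and image features, pool them, and apply a linear classifier."""
    attended, weights = cross_attention(
        text_feats, image_feats, params["w_q"], params["w_k"], params["w_v"]
    )
    fused = text_feats + attended              # residual connection
    pooled = fused.mean(axis=0)                # mean-pool over text tokens
    logits = pooled @ params["w_out"] + params["b_out"]
    return softmax(logits), weights


# Assumed shapes: 5 text tokens and 7 image patches, both already projected
# to a shared 16-dimensional space by hypothetical upstream encoders.
rng = np.random.default_rng(0)
d_model, n_classes = 16, 3
text_feats = rng.normal(size=(5, d_model))
image_feats = rng.normal(size=(7, d_model))

# Untrained random weights, for illustration only.
params = {
    "w_q": rng.normal(size=(d_model, d_model)) * 0.1,
    "w_k": rng.normal(size=(d_model, d_model)) * 0.1,
    "w_v": rng.normal(size=(d_model, d_model)) * 0.1,
    "w_out": rng.normal(size=(d_model, n_classes)) * 0.1,
    "b_out": np.zeros(n_classes),
}

probs, attn = fuse_and_classify(text_feats, image_feats, params)
print(probs.shape, attn.shape)
```

In a real project the random arrays would be replaced by encoder outputs (for example, from pretrained text and vision models), the weights would be learned, and audio or video would be handled by adding further context sequences in the same way. For regression, the final softmax would be dropped and `w_out` would map to a single output.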

Replies: 2 comments

Volta-lemon
Nov 2, 2024

+1

0 replies

Sm1les
Dec 11, 2024
Maintainer

This is essentially multimodal learning, so you can go straight to the Qwen-VL fine-tuning tutorial: https://github.com/datawhalechina/self-llm/blob/master/models/Qwen2-VL/06-Qwen2-VL-2B-Instruct%20Lora%20%E5%BE%AE%E8%B0%83%E6%A1%88%E4%BE%8B%20-%20LaTexOCR.md

0 replies

Answer selected by Sm1les