multimodal-deep-learning

Here are 404 public repositories matching this topic...

salesforce / LAVIS

LAVIS - A One-stop Library for Language-Vision Intelligence

deep-learning salesforce image-captioning deep-learning-library vision-framework vision-and-language multimodal-deep-learning multimodal-datasets vision-language-transformer vision-language-pretraining visual-question-anwsering

Updated Nov 18, 2024
Jupyter Notebook

Yutong-Zhou-cv / Awesome-Text-to-Image

(ෆ`꒳´ෆ) A Survey on Text-to-Image Generation/Synthesis.

survey generative-adversarial-network image-manipulation image-generation text-to-image image-synthesis multimodal multimodal-deep-learning awseome-list text-to-face

Updated Nov 7, 2024

AI4Finance-Foundation / FinRobot

FinRobot: An Open-Source AI Agent Platform for Financial Analysis using LLMs 🚀 🚀 🚀

finance multimodal-deep-learning robo-advisor large-language-models prompt-engineering chatgpt fingpt aiagent

Updated Nov 17, 2024
Jupyter Notebook

kyegomez / BitNet

Implementation of "BitNet: Scaling 1-bit Transformers for Large Language Models" in PyTorch

machine-learning deep-neural-networks artificial-intelligence deeplearning multimodal multimodal-deep-learning gpt4

Updated Nov 11, 2024
Python

KimMeen / Time-LLM

[ICLR 2024] Official implementation of "🦙 Time-LLM: Time Series Forecasting by Reprogramming Large Language Models"

machine-learning deep-learning time-series language-model time-series-analysis time-series-forecast time-series-forecasting multimodal-deep-learning cross-modality multimodal-time-series cross-modal-learning prompt-tuning large-language-models

Updated Nov 3, 2024
Python

AlibabaResearch / AdvancedLiterateMachinery

A collection of original, innovative ideas and algorithms towards Advanced Literate Machinery. This project is maintained by the OCR Team in the Language Technology Lab, Tongyi Lab, Alibaba Group.

Updated Dec 23, 2024
C++

jrzaurin / pytorch-widedeep

A flexible package for multimodal deep learning that combines tabular data with text and images using Wide and Deep models in PyTorch

python deep-learning text images tabular-data pytorch pytorch-cv multimodal-deep-learning pytorch-nlp pytorch-transformers model-hub pytorch-tabular-data

Updated Nov 6, 2024
Python

DWCTOD / CVPR2024-Papers-with-Code-Demo

Collects the latest CVPR (Conference on Computer Vision and Pattern Recognition) results, including papers, code, and demo videos. Recommendations are welcome!

computer-vision segmentation object-detection cvpr multimodal-deep-learning cvpr2021 cvpr2022 llm cvpr2023 segment-anything cvpr2024

Updated Apr 25, 2024

yuewang-cuhk / awesome-vision-language-pretraining-papers

Recent Advances in Vision and Language PreTrained Models (VL-PTMs)

bert vision-and-language multimodal-deep-learning pretraining vl-ptms

Updated Aug 19, 2022

TheShadow29 / awesome-grounding

Awesome Grounding: a curated list of research papers in visual grounding

natural-language-processing computer-vision paper awesome-list arxiv papers video-understanding captioning-images captioning-videos phrase-grounding language-grounding multimodal-deep-learning grounding visual-grounding embodied-agent video-grounding image-grounding paper-roadmap

Updated Apr 9, 2023

declare-lab / multimodal-deep-learning

This repository contains various models targeting multimodal representation learning and multimodal fusion for downstream tasks such as multimodal sentiment analysis.

multimodal-interactions multimodal-learning multimodal-sentiment-analysis multimodal-deep-learning