# multimodal-learning

Here are 242 public repositories matching this topic...

prasenjit52282 / BuStop

BuStop is an ML-based framework that automatically detects different stay-location types for intra-city public bus travel through multi-modal sensing.

machine-learning intelligent-transportation-systems multimodal-learning smartphone-sensor-data

Updated Nov 20, 2022
Jupyter Notebook

david-alvarez-rosa / referring-expression-comprehension

Exploring and Visualizing Referring Expression Comprehension (Bachelor's Thesis by David Álvarez Rosa)

machine-learning natural-language-processing computer-vision deep-learning artificial-intelligence multimodal-learning

Updated Jul 22, 2021
TeX

stoneMo / MGN

Official implementation for MGN

weakly-supervised-learning multimodal-learning audio-visual-learning audio-visual-parsing

Updated Dec 22, 2022
Python

YingWANGG / M2IB

Code for the paper Visual Explanations of Image–Text Representations via Multi-Modal Information Bottleneck Attribution

multimodal-learning interpretability-attribution

Updated Mar 25, 2024
Jupyter Notebook

Lukeasargen / Show-Attend-and-Tell-Pytorch-Lightning

Encoder-Decoder CNN-LSTM Model with an attention mechanism for image captioning. Trained using the Microsoft COCO Dataset.

text-generation pytorch lstm image-captioning show-attend-and-tell attention-mechanism encoder-decoder mscoco multimodal-learning attention-visualization pytorch-lightning

Updated Jul 27, 2022
Jupyter Notebook

yookyungkho / Multimodal-Entailment-pytorch

PyTorch implementation of a multimodal entailment baseline

nlp computer-vision deep-learning pytorch multimodal-learning multi-modal-learning multimodal-entailment

Updated May 24, 2022
Jupyter Notebook

sunoh-kim / PLRN

This repository contains an official PyTorch implementation of Position-aware Location Regression Network (PLRN) for temporal video grounding, which is presented in the paper Position-aware Location Regression Network for Temporal Video Grounding.

attention-mechanism multimodal-learning video-grounding

Updated Apr 16, 2022
Python

abhinav-neil / socratic-models

Socratic models for multimodal reasoning & image captioning

image-captioning clip multimodal-learning visual-question-answering gpt-3 chain-of-thought flan-t5 vision-language-learning

Updated Jun 4, 2023
Jupyter Notebook

anaezquerro / imx-evqa

Interactive Multimodal Explanations for Easy Visual Question Answering

natural-language-processing computer-vision multimodal-learning explainable-ai

Updated Mar 13, 2024
Jupyter Notebook

shantistewart / Emo-CLIM

Emo-CLIM: Emotion-Aligned Contrastive Learning Between Images and Music [ICASSP 2024]

music-information-retrieval multimodal-learning contrastive-learning

Updated Jan 15, 2024
Python

FIUPanther-JMolto98 / CalcWiz

Multimodal, intelligent, LLM- and RAG-powered math tutor that combines NLP with a CAS to produce answers that are mathematically sound, hallucination-free, and easy to digest, with step-by-step solutions delivered in natural language. Supports LaTeX front-end rendering with libraries such as MathJax.

machine-learning web mathematics wolfram-alpha wolfram artificial-intelligence computer-algebra-system wolfram-language wolfram-mathematica fullstack-development multimodal-learning sympy-library ai-assistant large-language-models generative-ai langchain llama-index retrieval-augmented-generation intelligent-note-taking

Updated Apr 24, 2024
JavaScript

DFKI-Earth-And-Space-Applications / MVCC_IGARSS

Public repository of our IGARSS 2023 submission

remote-sensing agriculture-research data-fusion multimodal-learning multiview-learning multi-view-learning crop-classification multi-modal-learning datafusion multisensor-fusion croptypes crop-type-mapping

Updated Jul 27, 2023
Python

talipucar / talipucar.github.io_old

Showcases ongoing and completed projects within various research themes.

domain-adaptation self-supervised multimodal-learning multimodal-deep-learning self-supervised-learning domain-translation

Updated Dec 28, 2022

minjoong507 / MPGN

[EMNLP 2022] PyTorch code for "Modal-specific Pseudo Query Generation for Video Corpus Moment Retrieval"

multimodal-learning video-retrieval video-grounding

Updated May 28, 2024
Python

stoneMo / OneAVM

Official Codebase of "A Unified Audio-Visual Learning Framework for Localization, Separation, and Recognition" (ICML 2023)

multimodal-learning self-supervised-learning sound-source-localization audio-visual-correspondence audio-visual-learning sound-source-separation

Updated Jun 1, 2023

willxxy / awesome-mmps

Corpus of resources for multimodal machine learning with physiological signals

machine-learning deep-learning signal-processing physiological-signals multimodal-learning multimodal multimodal-deep-learning multimodal-data

Updated Jul 2, 2024

linxueya / ImageFusion_AD

multimodal-learning alzheimer-s-disease

Updated May 29, 2020
Python

ThreeSR / QA-Oriented-Pretraining

Official code of QA-oriented pretraining

pytorch multimodal-learning

Updated Apr 8, 2023
Jupyter Notebook

TalissaMoura / sounding_earth_with_vit

Code for project Using Self-Supervised Learning to classify aerial scenes audiovisuals with remote sensing data

remote-sensing multimodal-learning vision-transformer

Updated Feb 6, 2024
Jupyter Notebook

AshwinRJ / Face-Generation-from-Voice

VoiceGAN - Hallucinating Faces from Voices

generative-adversarial-network face-generation multimodal-learning

Updated Nov 21, 2019
Jupyter Notebook
