Updated May 26, 2020 - Python
vqa-dataset
Here are 36 public repositories matching this topic...
The Visual Question Answering (VQA) project features a model with a simple GUI that handles both images and videos. It uses OpenAI's CLIP for encoding images and questions and GPT-2 for decoding embeddings to answer questions based on the VQA Version 2 dataset, which includes 265,016 images with multiple questions and answers.
Updated Jun 24, 2024 - Jupyter Notebook
Updated Dec 15, 2018 - Python
Egunean Behin Visual Question Answering Dataset
Updated Mar 31, 2022 - Jupyter Notebook
Part of our final year project work involving complex NLP tasks along with experimentation on various datasets and different LLMs
Updated Jan 12, 2024 - HTML
Deep Learning Web app that responds to any question about an image.
Updated May 12, 2020 - Python
Updated Apr 21, 2017 - Python
Visual Question Answering (VQA)
Updated Oct 2, 2021 - Python
Medical Report Generation And VQA (Adapting XrayGPT to Any Modality)
Updated Jun 24, 2024 - Python
VQA Challenge - hosted on Hasura using Flask
Updated Apr 30, 2018 - Python
Visual Question Answering (VQA) software powered by Flask. The project combines an image and a question to generate an answer.
Updated Jun 2, 2023 - HTML
Streamlit app for demonstrating multi-modal (vision + language) modelling in PyTorch.
Updated Aug 22, 2022 - Python
MAVERICS (Manually-vAlidated Vq^2a Examples fRom Image-Caption datasetS) is a suite of test-only benchmarks for visual question answering (VQA).
Updated Feb 18, 2023
Grid features extraction for ICCV 2021 paper "TRAR: Routing the Attention Spans in Transformers for Visual Question Answering"
Updated Oct 10, 2021 - Python
SSG-VQA is a Visual Question Answering (VQA) dataset on laparoscopic videos providing diverse, geometrically grounded, unbiased and surgical action-oriented queries generated using scene graphs.
Updated Aug 29, 2024 - Python
CLEVR3D Dataset: Comprehensive Visual Question Answering on Point Clouds through Compositional Scene Manipulation
Updated Feb 2, 2024 - Python
B.Sc. Final Project: LXMERT Model Compression for Visual Question Answering.
Updated Nov 22, 2023 - Python
This repo implements attention networks for visual question answering
Updated Dec 23, 2018 - Python
How well do the GPT-4V, Gemini Pro Vision, and Claude 3 Opus models perform zero-shot vision tasks on data structures?
Updated Jun 13, 2024 - Python