The project builds a multi-layered understanding of images so they can be studied from multiple perspectives, and uses this understanding to power a visual question answering system.
Visual Question Answering (VQA) uses machine learning techniques to answer natural-language questions about images. Our system works in two parts. The first part analyzes a given image and extracts its attributes, which are stored as a knowledge graph. The figure below shows how an image is passed through the various modules to generate a knowledge graph.
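As a rough illustration of this first stage, the sketch below merges the outputs of several detectors into a single graph. The detector stubs and attribute names here are hypothetical placeholders, not the project's actual module API:

```python
# Minimal sketch of stage one: per-module detections merged into one
# knowledge graph. The detector stubs are hypothetical stand-ins for
# the real classifiers in the modules/ directory.
import networkx as nx

def detect_people(image):
    # Stand-in for a person detector: (person id, shirt color) pairs.
    return [(0, "orange"), (1, "blue"), (2, "green"), (3, "red")]

def classify_scene(image):
    # Stand-in for a scene classifier.
    return "corral"

def build_knowledge_graph(image):
    graph = nx.DiGraph()
    graph.add_node("image")
    # Each module contributes its own nodes and relations to the graph.
    for pid, color in detect_people(image):
        node = f"person_{pid}"
        graph.add_edge("image", node, relation="contains")
        graph.nodes[node]["color"] = color
    graph.add_edge("image", classify_scene(image), relation="located_in")
    return graph
```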
The second part generates a descriptive paragraph from the knowledge graph using basic English syntax; this is handled by the paragraph_generator module. We then run a pre-trained DeepPavlov reading-comprehension model on that paragraph to answer the questions asked by users.
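Continuing the sketch above, the second stage can be thought of as a template step followed by a DeepPavlov query. The paragraph templates are invented for illustration; the config name assumes DeepPavlov's pre-trained SQuAD BERT reader (available in pre-1.0 releases) and may differ from the model the project actually uses:

```python
# Sketch of stage two: knowledge graph -> paragraph -> extractive QA.
from deeppavlov import build_model, configs

def generate_paragraph(graph):
    # Toy template-based rendering, loosely mirroring paragraph_generator.
    people = [n for n in graph.nodes if n.startswith("person_")]
    scene = next(v for _, v, d in graph.edges(data=True)
                 if d.get("relation") == "located_in")
    return (f"There are {len(people)} people in the image. "
            f"The image is taken in a {scene}.")

paragraph = generate_paragraph(build_knowledge_graph(None))

# Pre-trained SQuAD-style reader; downloads weights on first use.
model = build_model(configs.squad.squad_bert, download=True)
answers, positions, scores = model([paragraph], ["How many people are there?"])
print(answers[0])
```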
Here are some examples of what our system is capable of (the input images are omitted here):
| Question | Answer |
|---|---|
| How many people are there? | 4 |
| Where is this image taken? | Corral |
| What color is the person wearing? | Orange |
| What is the man doing? | Throwing a frisbee in the air |
- The data directory contains pre-trained models and weights.
- The modules directory contains files for the individual detection and classification tasks.
- The utils directory contains utility and helper functions.
- The DeepRNN directory contains the scripts required for image captioning (from DeepRNN/image_captioning).
Python 3 is required.
- Clone the repository:

  ```sh
  git lfs clone --recurse-submodules https://github.com/shubham1172/VQA.git
  ```

- Install the dependencies:

  ```sh
  pip install -r requirements.txt
  ```
Run the system on an image:

```sh
python3 run.py --path path/to/image
```
Image captioning: DeepRNN/image_captioning