Hi👋! This is a PyTorch implementation of the explainability system proposed in A Study on Multimodal and Interactive Explanations for Visual Question Answering (Alipour et al., 2020). We adapted their experiments to obtain interactive multimodal explanations, based on text and images, on the Easy-VQA dataset.
Our adaptation is shown below. We use a Conditioned U-Net to combine the visual information with textual information from the question words. Its output is a mask that is applied to the original input image, which is then fed to a convolutional classifier with text conditioning.
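As a rough illustration of that pipeline, here is a minimal PyTorch sketch: a question embedding conditions a small encoder-decoder (a toy stand-in for the Conditioned U-Net, using FiLM-style conditioning as one possible choice) that emits a spatial mask; the mask gates the input image before a small text-conditioned classifier. All layer sizes and names here are hypothetical; the actual architecture lives in `modules/` and `model.py`.

```python
import torch
import torch.nn as nn

class MaskedVQAClassifier(nn.Module):
    """Toy sketch of the masking pipeline described above.

    Hypothetical sizes throughout; the repository's modules/ code
    defines the real architecture.
    """

    def __init__(self, vocab_size=30, embed_dim=16, num_answers=13):
        super().__init__()
        self.embed = nn.EmbeddingBag(vocab_size, embed_dim)  # bag of question words
        # Stand-in for the Conditioned U-Net: image features + text -> mask
        self.enc = nn.Conv2d(3, 8, 3, padding=1)
        self.film = nn.Linear(embed_dim, 8 * 2)  # FiLM-style conditioning (assumption)
        self.dec = nn.Conv2d(8, 1, 3, padding=1)
        # Convolutional classifier on the masked image, again text-conditioned
        self.cls_conv = nn.Conv2d(3, 8, 3, stride=2, padding=1)
        self.head = nn.Linear(8 + embed_dim, num_answers)

    def forward(self, image, question_ids):
        t = self.embed(question_ids)                       # (B, embed_dim)
        h = torch.relu(self.enc(image))                    # (B, 8, H, W)
        gamma, beta = self.film(t).chunk(2, dim=1)         # condition on the question
        h = h * gamma[:, :, None, None] + beta[:, :, None, None]
        mask = torch.sigmoid(self.dec(h))                  # (B, 1, H, W) explanation mask
        masked = image * mask                              # apply mask to the input image
        z = torch.relu(self.cls_conv(masked)).mean(dim=(2, 3))  # (B, 8)
        logits = self.head(torch.cat([z, t], dim=1))
        return logits, mask

model = MaskedVQAClassifier()
img = torch.randn(2, 3, 64, 64)
q = torch.randint(0, 30, (2, 5))
logits, mask = model(img, q)
print(logits.shape, mask.shape)  # torch.Size([2, 13]) torch.Size([2, 1, 64, 64])
```

The mask is the part that serves as the visual explanation: because the classifier only sees the masked image, the mask must highlight the regions relevant to answering the question.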
Installation: install the libraries listed in requirements.txt to train and deploy our system:
    pip3 install -r requirements.txt
Before training or predicting with our system, you must download the Easy-VQA dataset: the folder easy_vqa/data must be placed at the root of this repository and renamed easy-vqa. Alternatively, run the following terminal commands:
    wget https://github.com/vzhou842/easy-VQA/archive/refs/heads/master.zip
    unzip master.zip
    mv easy-VQA-master/easy_vqa/data/ easy-vqa
    rm -r easy-VQA-master/
    rm master.zip
The folder structure should look like this:
    easy-vqa/
        test/
        train/
        answers.txt
    modules/
    utils/
    model.py
    system.py
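To catch path mistakes early, a small check like the following can verify that the dataset landed where the code expects it (this helper is a convenience sketch, not part of the repository):

```python
from pathlib import Path

def check_dataset(root="."):
    """Return any expected Easy-VQA dataset paths missing under `root`.

    Convenience helper, not part of the repository's code.
    """
    expected = ["easy-vqa/train", "easy-vqa/test", "easy-vqa/answers.txt"]
    return [p for p in expected if not (Path(root) / p).exists()]

missing = check_dataset()
print("Dataset OK" if not missing else f"Missing: {', '.join(missing)}")
```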
Before deploying the system, the neural network must be trained. We suggest running the system.py script directly to train the model (some parameters, such as the batch size or the number of epochs, can be configured). With the default configuration the model should reach an F-score of about 90%. At the end of training, the system parameters are stored in the results/ folder.
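Restoring the stored parameters for prediction follows the standard PyTorch state-dict round trip, sketched below with a stand-in network. The checkpoint filename is hypothetical; check system.py for the name actually written under results/.

```python
import torch
import torch.nn as nn

# Stand-in for the trained network; "model.pt" is a hypothetical
# filename -- system.py defines the real checkpoint path under results/.
model = nn.Linear(4, 2)
torch.save(model.state_dict(), "model.pt")   # what training does at the end

restored = nn.Linear(4, 2)                   # must match the saved architecture
restored.load_state_dict(torch.load("model.pt"))
restored.eval()                              # inference mode (dropout/BN frozen)
```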
See the prepared demo.ipynb for documented examples of how to work with interactive multimodal explanations.