A lightweight deep learning model with a web application that answers image-based questions using a non-generative approach for the VizWiz Grand Challenge 2023, built by carefully curating the answer vocabulary and adding a linear layer on top of OpenAI's CLIP model, which serves as the image and text encoder.
machine-learning
deep-learning
vqa
clip
text-encoding
image-and-text
visual-question-answering
vqa-dataset
image-encoding
vizwiz
clip-model
vizwiz-vqa
visual-question-anwsering
open-ai-clip
vqa-2023
Updated Jun 27, 2023 - Jupyter Notebook
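The description above names two ideas: a curated answer vocabulary and a linear layer on top of CLIP features, so that answering becomes classification rather than text generation. The following is a minimal sketch of that pattern, not the repository's actual code. Everything in it is an assumption made for illustration: the helper names `build_answer_vocab` and `LinearAnswerHead`, the curation rule (take the most common crowd answer per question), the fusion by concatenation, and the toy 4-dimensional vectors that stand in for CLIP image and text embeddings. It uses only the Python standard library so it runs without a deep learning framework; the weights are random and untrained.

```python
from collections import Counter
import random


def build_answer_vocab(answers_per_question, min_count=1):
    """Pick the most common crowd answer per question, then keep answers
    that were picked at least `min_count` times (hypothetical curation rule)."""
    chosen = []
    for answers in answers_per_question:
        normalized = [a.strip().lower() for a in answers]
        chosen.append(Counter(normalized).most_common(1)[0][0])
    counts = Counter(chosen)
    return sorted(a for a, c in counts.items() if c >= min_count)


class LinearAnswerHead:
    """A single linear layer mapping fused features to one logit per answer."""

    def __init__(self, in_dim, vocab, seed=0):
        rng = random.Random(seed)
        self.vocab = vocab
        # One weight row per answer in the vocabulary (random, untrained).
        self.weights = [
            [rng.uniform(-0.1, 0.1) for _ in range(in_dim)] for _ in vocab
        ]
        self.bias = [0.0] * len(vocab)

    def logits(self, image_feat, text_feat):
        # Fusion by concatenation is an assumption, not taken from the source.
        fused = list(image_feat) + list(text_feat)
        return [
            sum(w * x for w, x in zip(row, fused)) + b
            for row, b in zip(self.weights, self.bias)
        ]

    def predict(self, image_feat, text_feat):
        # Non-generative answering: return the highest-scoring vocabulary entry.
        scores = self.logits(image_feat, text_feat)
        best = max(range(len(scores)), key=scores.__getitem__)
        return self.vocab[best]


# Toy crowd answers for three questions (made up for this example).
annotations = [
    ["Coke", "coke", "soda", "coke "],
    ["unanswerable", "unanswerable", "blurry"],
    ["blue", "Blue", "navy"],
]
vocab = build_answer_vocab(annotations)

# Stand-ins for CLIP image and text embeddings (real ones are much larger).
image_feat = [0.1] * 4
text_feat = [0.2] * 4

head = LinearAnswerHead(in_dim=8, vocab=vocab)
answer = head.predict(image_feat, text_feat)
print(vocab)
print(answer)
```

In a real setup the two feature vectors would come from a frozen or fine-tuned CLIP encoder and the linear layer would be trained with a classification loss over the vocabulary; because the output is always a vocabulary entry, the quality of the curated answer list bounds what the model can answer.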