Skip to content

vi-hna-ja/Image-Caption-Generator

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

4 Commits
 
 
 
 
 
 

Repository files navigation

Image Caption Generator

This project is done using TensorFlow and Keras API.

  • The dataset can be downloaded from here.
  • The architecture for this project follows an Encoder-Decoder approach.
  • The Encoder model is a Convolutional Neural Network because the input is an image.
  • It's output is then fed into the Decoder model which is a LSTM.
  • The image preprocessing is reshaping the image. Transfer Learning is utilized to generate better predictions.
  • The text preprocessing consists of cleaning the data by removing digits, converting to lowercase and then generating the mappings for each word to unqiue number and its reverse.
  • Vocabulary is then created out of the words by including only the words that fall below the certain threshold. This discarded all the most frequently words like a, an, the.
  • Then the word embedding matrix between the vocabulary is calculated by using the glove 6B 50D vector.

Generated Predictions

About

Image Caption Generator using TensorFlow and Keras

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages