Image Captioning

this is a deep learning project, which has used the concept of AutoEncoders or Popularly known as seq2seq models.

Description

the deep learning model takes input as an image and then tries to output a sentence which describes the image, like below

will output a red bus parked on a city street

which is not exactly correct, but model was able to identify that object is bus, and its parked. Currently the model is in quite nascent stage, and with better datasets and architectures better results can be found.

Architecture

In encoder, simple CNN layers are used. This will learn the complex representation of the image and objects, and store it in latent variable, or we say bottleneck layer.

In decoder, LSTM(Long Short Term Memory) is used, this will output next word on the basis of last word outputed. This is RNN architecture.

Name		Name	Last commit message	Last commit date
Latest commit History 5 Commits
Image_Captioning.ipynb		Image_Captioning.ipynb
Image_Captioning_new.ipynb		Image_Captioning_new.ipynb
README.md		README.md
electric_bus.jpg		electric_bus.jpg

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Image Captioning

Description

Architecture

About

Releases

Packages

Languages

priyam314/Deep-Learning

Folders and files

Latest commit

History

Repository files navigation

Image Captioning

Description

Architecture

About

Topics

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages