Course Project for Spring 2022 EECS E6895: Advanced Big Data and AI

Video Caption Generation Using Deep Learning and NLP

Project Description

With the growing volume of multimedia content in everyday life, automatic understanding of images and video has become essential. Captioning is a natural language processing task that addresses this need: web search, for example, relies heavily on the tags and titles attached to media, and automatic image and video captioning would allow such annotations to be generated directly from the content. This would enable search algorithms that index media by what they actually show, and it would also support better recommendation systems for users. The goal of this project is to build image captioning and video captioning models. In the first part of the project, an image captioning model is designed using a deep learning encoder-decoder architecture; in the second part, the model is extended to video captioning. The Flickr8k and TRECVID-VTT datasets are used for the two tasks. The models are evaluated with BLEU scores, and greedy and beam search decoding are used for real-time testing.
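
As a rough illustration of the BLEU evaluation mentioned above (not the project's own evaluation script), the sketch below scores a predicted caption against its reference captions using NLTK; the example captions are invented for the illustration.

```python
# Minimal BLEU-1..BLEU-4 evaluation sketch using NLTK (illustrative only; the
# captions below are made up and not taken from the project's datasets).
from nltk.translate.bleu_score import corpus_bleu, SmoothingFunction

# One entry per test image/video: a list of tokenized reference captions,
# and the single tokenized caption predicted by the model.
references = [
    [["a", "boy", "sits", "in", "a", "car"],
     ["a", "child", "is", "sitting", "in", "a", "car", "seat"]],
]
hypotheses = [["a", "boy", "sitting", "on", "a", "car", "seat"]]

smooth = SmoothingFunction().method1  # avoids zero scores on short captions
weights = {
    "BLEU-1": (1.0, 0, 0, 0),
    "BLEU-2": (0.5, 0.5, 0, 0),
    "BLEU-3": (1 / 3, 1 / 3, 1 / 3, 0),
    "BLEU-4": (0.25, 0.25, 0.25, 0.25),
}
for name, w in weights.items():
    score = corpus_bleu(references, hypotheses, weights=w, smoothing_function=smooth)
    print(f"{name}: {score:.3f}")
```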

Setup

Clone the repository: git clone https://github.com/Sapphirine/video_caption_generation.git

Video

The video explaining the project can be found here

System Overview
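
The system overview diagram is provided as an image in the repository. As a rough, hypothetical sketch of the encoder-decoder design described above (the actual backbone, vocabulary handling, and hyperparameters may differ), a CNN encoder maps an image or sampled video frame to a feature vector, which seeds an LSTM decoder that emits the caption one token at a time:

```python
# Hypothetical encoder-decoder captioning sketch in PyTorch; the project's
# actual architecture and training setup may differ.
import torch
import torch.nn as nn
import torchvision.models as models

class EncoderCNN(nn.Module):
    """Maps an image (or a sampled video frame) to a fixed-size feature vector."""
    def __init__(self, embed_size):
        super().__init__()
        backbone = models.resnet18(weights=None)  # load ImageNet weights in practice
        self.features = nn.Sequential(*list(backbone.children())[:-1])  # drop final fc
        self.fc = nn.Linear(backbone.fc.in_features, embed_size)

    def forward(self, images):                       # images: (B, 3, 224, 224)
        feats = self.features(images).flatten(1)     # (B, 512)
        return self.fc(feats)                        # (B, embed_size)

class DecoderRNN(nn.Module):
    """Generates a caption conditioned on the image feature vector."""
    def __init__(self, embed_size, hidden_size, vocab_size):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_size)
        self.lstm = nn.LSTM(embed_size, hidden_size, batch_first=True)
        self.fc = nn.Linear(hidden_size, vocab_size)

    def forward(self, image_feats, captions):        # teacher forcing at training time
        emb = self.embed(captions)                                   # (B, T, E)
        inputs = torch.cat([image_feats.unsqueeze(1), emb], dim=1)   # image feature is step 0
        out, _ = self.lstm(inputs)
        return self.fc(out)                                          # (B, T+1, vocab_size)

    @torch.no_grad()
    def greedy_decode(self, image_feats, max_len=20, end_id=2):
        """Greedy search: at each step keep only the single most likely token."""
        inputs, states, tokens = image_feats.unsqueeze(1), None, []
        for _ in range(max_len):
            out, states = self.lstm(inputs, states)
            word = self.fc(out.squeeze(1)).argmax(dim=1)             # (B,), here B == 1
            if word.item() == end_id:
                break
            tokens.append(word.item())
            inputs = self.embed(word).unsqueeze(1)
        return tokens

# Toy usage with random weights and a dummy image, just to show the shapes.
encoder, decoder = EncoderCNN(256), DecoderRNN(256, 512, vocab_size=5000)
image = torch.randn(1, 3, 224, 224)
caption_ids = decoder.greedy_decode(encoder(image))
```

Beam search keeps the k highest-scoring partial captions at each step instead of a single one, trading extra computation for typically more fluent output; the project uses both greedy and beam search decoding for real-time testing.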

Results

Predicted captions for sample test videos:

- a boy sitting on a car seat of a car
- a man is talking to a crowd
- a boy in a bathtub
- a cat in a room
- a video of a video of a person

Actual vs. Predicted Captions
