An Encoder-Decoder Model for Sequence-to-Sequence Learning: Video to Text
MSVD Dataset (Download)
1450 videos for training, 100 videos for testing
The input features are extracted with a VGG network pretrained on ImageNet.
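As a rough illustration, here is a minimal Keras sketch of an encoder-decoder model over per-frame VGG features. The frame count, hidden size, vocabulary size, and caption length below are assumptions for the example; the actual model in this repo may differ.

```python
# Minimal encoder-decoder (seq2seq) sketch over VGG frame features.
# Sizes below are placeholder assumptions, not the repo's settings.
from tensorflow.keras.layers import Input, LSTM, Dense, Embedding
from tensorflow.keras.models import Model

num_frames, feat_dim = 80, 4096         # VGG features per video (assumed)
vocab_size, max_caption_len = 3000, 40   # hypothetical text-side sizes
hidden = 256

# Encoder: read the sequence of frame features, keep the final LSTM state.
enc_in = Input(shape=(num_frames, feat_dim))
_, state_h, state_c = LSTM(hidden, return_state=True)(enc_in)

# Decoder: generate the caption conditioned on the encoder state.
dec_in = Input(shape=(max_caption_len,))
dec_emb = Embedding(vocab_size, hidden)(dec_in)
dec_seq, _, _ = LSTM(hidden, return_sequences=True, return_state=True)(
    dec_emb, initial_state=[state_h, state_c])
dec_out = Dense(vocab_size, activation="softmax")(dec_seq)

model = Model([enc_in, dec_in], dec_out)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
model.summary()
```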
usage: video2text.py [-h] --uid UID [--train_path TRAIN_PATH]
[--test_path TEST_PATH] [--learning_rate LEARNING_RATE]
[--batch_size BATCH_SIZE] [--epoch EPOCH] [--test]
Video to Text Model
optional arguments:
-h, --help show this help message and exit
--uid UID training uid
--train_path TRAIN_PATH
training data path
--test_path TEST_PATH
test data path
--learning_rate LEARNING_RATE
learning rate for training
--batch_size BATCH_SIZE
batch size for training
--epoch EPOCH epochs for training
--test use this flag for testing
Split the pre-extracted video features into training and testing directories. For training, you may want to preprocess the data first.
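For example, here is a minimal sketch of splitting per-video feature files into the two directories; the .npy file layout, video-id convention, and directory names are assumptions, not necessarily the repo's exact format.

```python
# Sketch: copy one feature file per video into train/test directories.
# File naming and directory layout here are assumptions.
import os
import shutil

def split_features(feat_dir, train_ids, test_ids,
                   train_dir="data/train", test_dir="data/test"):
    os.makedirs(train_dir, exist_ok=True)
    os.makedirs(test_dir, exist_ok=True)
    for fname in os.listdir(feat_dir):
        vid = os.path.splitext(fname)[0]   # video id taken from file name
        if vid in train_ids:
            shutil.copy(os.path.join(feat_dir, fname), train_dir)
        elif vid in test_ids:
            shutil.copy(os.path.join(feat_dir, fname), test_dir)
```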
For testing, use the --test flag. Here is a sample command to generate the testing results:
python video2text.py --uid best --test
This generates the video-to-text output at test_ouput.txt; the average BLEU score is 0.69009423.
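If you want to recompute the score yourself, here is a minimal sketch of averaging sentence-level BLEU with NLTK; the repo's own evaluation may use a different tokenization or BLEU variant, so the numbers need not match exactly.

```python
# Sketch: average sentence-level BLEU over generated captions with NLTK.
# Whitespace tokenization and the smoothing choice are assumptions.
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

def average_bleu(references, hypotheses):
    # references: list of lists of reference captions (strings) per video
    # hypotheses: list of generated captions (strings), in the same order
    smooth = SmoothingFunction().method1
    scores = []
    for refs, hyp in zip(references, hypotheses):
        refs_tok = [r.lower().split() for r in refs]
        hyp_tok = hyp.lower().split()
        scores.append(sentence_bleu(refs_tok, hyp_tok,
                                    smoothing_function=smooth))
    return sum(scores) / len(scores)
```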
For more information, check out the report.
Reference: Keras Blog, "A ten-minute introduction to sequence-to-sequence learning in Keras"