alpargun/learn-from-youtube

Learn From YouTube Videos

This repository aims to create on-demand video datasets from YouTube, which can then be used to train video networks through self-supervised learning.

A .csv file with YouTube video IDs is enough to automatically download and preprocess all videos. Additionally, all videos from a single channel/user can be downloaded automatically.
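As a rough illustration of the .csv input described above, the sketch below turns a column of video IDs into watch URLs. It is not the repository's actual loader: the function name `read_video_urls`, the column name `video_id`, and the sample IDs are hypothetical, and the real file layout may differ.

```python
import csv
import io


def read_video_urls(csv_text, column="video_id"):
    """Return one watch URL per non-empty ID in the given column.

    The column name "video_id" is an assumption made for this sketch.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    urls = []
    for row in reader:
        video_id = (row.get(column) or "").strip()
        if video_id:  # skip rows with a missing or empty ID
            urls.append(f"https://www.youtube.com/watch?v={video_id}")
    return urls


# Placeholder IDs, not real videos.
sample_csv = "video_id\nabc123\nxyz789\n"
urls = read_video_urls(sample_csv)
```

The resulting list of URLs could then be handed to whatever downloader is in use.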

Videos are downloaded with my PyTube fork, which includes bug fixes and enhancements.

Progress so far

  • Can download YouTube videos from a .csv file with URLs.
  • Can clip videos into $n$ frames with a step of $m$ between frames to prepare input for video models.
  • Can sample multiple clips from the same video.
  • Implements custom video dataset class and dataloaders. Visualizes clips from batches for debugging.
  • Implements video transforms such as cropping, resizing, and normalization.
  • Can run inference with a downloaded 3D-ResNet50 model on downloaded Kinetics-400 (K400) test set videos.
  • Can run YOLOv8 inference on the prepared dataset and visualize the detections.
  • Extracts important keywords from the video description and title.
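
The clipping and multi-clip sampling listed above can be sketched with plain index arithmetic. This is only an illustration under assumptions: the function names are hypothetical, and evenly spaced clip starts are one possible sampling strategy, not necessarily the one the repository uses.

```python
def clip_frame_indices(total_frames, n, m, start=0):
    """Indices of a clip with n frames and a step of m between frames."""
    if n <= 0 or m <= 0:
        raise ValueError("n and m must be positive")
    span = (n - 1) * m + 1  # number of source frames the clip covers
    if start < 0 or start + span > total_frames:
        raise ValueError("clip does not fit inside the video")
    return [start + i * m for i in range(n)]


def sample_clip_starts(total_frames, n, m, num_clips):
    """Evenly spaced start indices for num_clips clips (assumed strategy)."""
    if num_clips <= 0:
        return []
    span = (n - 1) * m + 1
    last_start = total_frames - span  # latest start that still fits
    if last_start < 0:
        return []  # video too short for even one clip
    if num_clips == 1:
        return [last_start // 2]  # a single centered clip
    return [(i * last_start) // (num_clips - 1) for i in range(num_clips)]


# Three clips of 8 frames with step 4 from a 100-frame video.
starts = sample_clip_starts(100, n=8, m=4, num_clips=3)
clips = [clip_frame_indices(100, 8, 4, start=s) for s in starts]
```

Each index list can be used to pick frames from a decoded video before the transforms are applied. Clips may overlap when the video is short relative to the number of clips requested.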

TODO

  • Save YouTube links
  • Implement video downloader
  • Prepare video dataset class
  • Preprocessing/Transform for videos (e.g. resolution, clipping)
  • Test inference of downloaded video networks
  • Download title and description from videos and extract keywords to label videos
  • Test inference of downloaded image networks (YOLO)
  • Make YOLO predicted bounding boxes interactive for self-labeling/label correction
  • Implement own network
  • SSL training
  • Evaluate on downstream CV tasks
  • Evaluate on downstream RL tasks (Atari etc.) to test interacting with the world
