简体中文 | English

Introduction


video-subtitle-extractor extracts hard-coded subtitles from a video and generates an SRT file. It implements the following:

  • detect and extract subtitle frames (and keyframes, using traditional graphics methods)
  • detect subtitle areas (i.e., their coordinates), and optionally scene text, using deep learning algorithms
  • recognise the content of subtitles (i.e., convert text in the image into plain text) using deep learning algorithms
  • filter out non-subtitle text
  • remove duplicate subtitle lines
  • generate an SRT file
  • multiple languages supported: Chinese/English, Traditional Chinese, Japanese, Korean, French, German, Russian, Spanish, Portuguese, Italian
  • multiple modes:
    • fast: high extraction speed, but a few subtitles may be missed (recommended)
    • accurate: no subtitles missed, but extraction is slower
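
The last two processing steps in the list above (removing duplicate lines and generating the SRT file) can be illustrated with a minimal sketch. This is not the project's own implementation: the `Cue` class and all function names are hypothetical, and duplicates are detected by exact text match only, which is a simplification of whatever comparison a real OCR pipeline would need.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Cue:
    """Hypothetical container for one recognised subtitle line."""
    start_ms: int  # cue start time in milliseconds
    end_ms: int    # cue end time in milliseconds
    text: str      # recognised subtitle text


def format_timestamp(ms: int) -> str:
    """Format milliseconds as an SRT timestamp: HH:MM:SS,mmm."""
    hours, rest = divmod(ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, millis = divmod(rest, 1_000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def merge_duplicates(cues: Iterable[Cue]) -> List[Cue]:
    """Merge consecutive cues whose text is identical (exact match only)."""
    merged: List[Cue] = []
    for cue in cues:
        if merged and merged[-1].text == cue.text:
            # Same line seen again: extend the previous cue instead of repeating it.
            merged[-1].end_ms = max(merged[-1].end_ms, cue.end_ms)
        else:
            merged.append(Cue(cue.start_ms, cue.end_ms, cue.text))
    return merged


def to_srt(cues: Iterable[Cue]) -> str:
    """Render cues as SRT text: index, time range, text, then a blank line."""
    blocks = []
    for index, cue in enumerate(merge_duplicates(cues), start=1):
        time_range = f"{format_timestamp(cue.start_ms)} --> {format_timestamp(cue.end_ms)}"
        blocks.append(f"{index}\n{time_range}\n{cue.text}\n")
    return "\n".join(blocks)
```

Calling `to_srt` on a list of cues returns the text that would be written to the `.srt` file; two adjacent cues with the same text collapse into one entry spanning both time ranges.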

Download

Features

  • No preprocessing is needed to get good results.
  • The project runs offline. You don't need to call any online API to get results.
  • In the Command Line Interface (CLI) version, you don't need to set the subtitle location manually. The program detects the subtitle area automatically.
  • GPU support is available. Installing CUDA and cuDNN speeds up detection and recognition and can give more accurate results.


Suggestions for improving this project are welcome in ISSUES.

Demo

  • Graphic User Interface (GUI):

demo.gif

  • Command Line Interface (CLI):

Demo Video

Running Online

  • Google Colab Notebook with free GPU: Open In Colab

Note: only the CLI version can run on Google Colab.

Getting Started

1. Download and Install Anaconda

https://www.anaconda.com/products/individual#Downloads

2. Install Dependencies

Make sure you have Python 3.8+ installed. Create and activate a conda virtual environment, then install the dependencies.

  • For Mac users and CPU-only users:

    • Install dependencies:

      conda env create -n videoEnv -f ./environment.yml
      conda activate videoEnv  
  • For users with an NVIDIA graphics card (the GPU version can achieve better accuracy):

    • Install dependencies:

      conda env create -n videoEnv -f ./environment_gpu.yml
      conda activate videoEnv  
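
Before running the program, it can help to confirm that the interpreter and the active environment match the requirements above. The following is a minimal sketch, not part of the project: `check_environment` is a hypothetical helper, and it relies on the `CONDA_DEFAULT_ENV` variable that conda sets when an environment is activated.

```python
import os
import sys
from typing import List, Mapping, Tuple


def check_environment(
    version: Tuple[int, int] = sys.version_info[:2],
    env: Mapping[str, str] = os.environ,
    expected_env: str = "videoEnv",
) -> List[str]:
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    if version < (3, 8):
        problems.append(f"Python 3.8+ required, found {version[0]}.{version[1]}")
    # conda sets CONDA_DEFAULT_ENV to the name (or path) of the active environment.
    active = env.get("CONDA_DEFAULT_ENV")
    if active != expected_env:
        problems.append(f"expected conda env {expected_env!r}, active env is {active!r}")
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print("WARNING:", problem)
```

The defaults inspect the current process; the parameters exist so the check can also be exercised with explicit values.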

3. Running the program

  • Run the GUI version:

    python gui.py

  • Run the CLI version:

    python ./backend/main.py

Q & A

1. Running Failure or Environment Problem

Solution: If you are using an NVIDIA Ampere-architecture graphics card such as the 3060/3070/3080, use the latest PaddlePaddle version and CUDA 11.2. If you failed to install the GPU environment with Conda, try a manual installation:

  • Install CUDA 11.2 and cuDNN 8.1.1

    Linux
    (1) Download CUDA 11.2
    wget https://developer.download.nvidia.com/compute/cuda/11.2.0/local_installers/cuda_11.2.0_460.27.04_linux.run
    (2) Install CUDA 11.2
    sudo sh cuda_11.2.0_460.27.04_linux.run --override

    1. Enter accept

    2. Make sure CUDA Toolkit 11.2 is selected (if you have already installed a driver, do not select Driver)

    3. Add environment variables

    Add the following to ~/.bashrc:

    # CUDA
    export PATH=/usr/local/cuda-11.2/bin${PATH:+:${PATH}}
    export LD_LIBRARY_PATH=/usr/local/cuda-11.2/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}

    Reload the shell configuration so the changes take effect:

    source ~/.bashrc
    (3) Download cuDNN 8.1.1

    cudnn-11.2-linux-x64-v8.1.1.33.tgz

    (4) Install cuDNN 8.1.1
     tar -zxvf cudnn-11.2-linux-x64-v8.1.1.33.tgz
     sudo cp ./cuda/include/* /usr/local/cuda-11.2/include/
     sudo cp ./cuda/lib64/* /usr/local/cuda-11.2/lib64/
     sudo chmod a+r /usr/local/cuda-11.2/lib64/*
     sudo chmod a+r /usr/local/cuda-11.2/include/*
    Windows
    (1) Download CUDA 11.2
    cuda_11.2.0_460.89_win10.exe
    (2) Install CUDA 11.2
    (3) Download cuDNN 8.1.1

    cudnn-11.2-windows-x64-v8.1.1.33.zip

    (4) Install cuDNN 8.1.1

    Unzip "cudnn-11.2-windows-x64-v8.1.1.33.zip", then copy all files from the "bin", "include", and "lib" folders of the extracted cuda directory into the matching folders under C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.2\

  • Install paddlepaddle:

    conda install paddlepaddle-gpu==2.1.3 cudatoolkit=11.2 -c https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/Paddle/ -c conda-forge 

    If you installed CUDA 10.2, install cuDNN 7.6.5 instead of cuDNN v8.x.

  • Install other dependencies:

    pip install -r requirements_gpu.txt
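
After the manual installation, a quick sanity check can show whether each piece is visible. The sketch below is an assumption-laden illustration rather than a project script: the helper names are hypothetical, the header path is taken from the Linux copy step above, and it relies on cuDNN 8 shipping its version macros in `cudnn_version.h` and on PaddlePaddle's `paddle.is_compiled_with_cuda()`.

```python
import re
import shutil
from typing import Optional, Tuple


def parse_cudnn_version(header_text: str) -> Optional[Tuple[int, ...]]:
    """Extract (major, minor, patch) from the text of cudnn_version.h."""
    parts = []
    for name in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        match = re.search(rf"#define\s+{name}\s+(\d+)", header_text)
        if match is None:
            return None  # not a cuDNN 8 style version header
        parts.append(int(match.group(1)))
    return tuple(parts)


def paddle_cuda_status() -> str:
    """Report whether the installed PaddlePaddle build was compiled with CUDA."""
    try:
        import paddle  # only importable after the PaddlePaddle install step
    except ImportError:
        return "paddle not installed"
    return "cuda build" if paddle.is_compiled_with_cuda() else "cpu-only build"


if __name__ == "__main__":
    # nvcc is found only if the CUDA bin directory is on PATH.
    print("nvcc:", shutil.which("nvcc") or "not found on PATH")
    try:
        # Assumed location, matching the Linux copy step for cuDNN headers.
        with open("/usr/local/cuda-11.2/include/cudnn_version.h") as header:
            print("cuDNN:", parse_cudnn_version(header.read()))
    except OSError:
        print("cuDNN: header not found")
    print("paddle:", paddle_cuda_status())
```

On Windows the header would live under the CUDA v11.2 include folder instead, so the path in the script would need adjusting.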

2. Errors related to "geos_c.dll" on Windows

    _lgeos = CDLL(os.path.join(sys.prefix, 'Library', 'bin', 'geos_c.dll'))
  File "C:\Users\Flavi\anaconda3\envs\subEnv\lib\ctypes\__init__.py", line 364, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: [WinError 126] The specified module could not be found.

Solution:

  1. Uninstall Shapely

    pip uninstall Shapely -y

  2. Reinstall Shapely via conda (make sure you have Anaconda or Miniconda installed)

    conda install Shapely
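
To confirm the fix took effect, one option is to check for the DLL at the location the traceback above tries to load it from. This is a small sketch with hypothetical helper names; the path components are copied from the traceback, and the check is only meaningful inside the conda environment on Windows.

```python
import os
import sys


def geos_dll_path(prefix: str = sys.prefix) -> str:
    """Build the path that the traceback above tries to load."""
    return os.path.join(prefix, "Library", "bin", "geos_c.dll")


def geos_dll_present(prefix: str = sys.prefix) -> bool:
    """Return True if geos_c.dll exists inside the given environment prefix."""
    return os.path.isfile(geos_dll_path(prefix))


if __name__ == "__main__":
    print(geos_dll_path(), "exists:", geos_dll_present())
```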

Community Support

JetBrains All Products Pack

The IDE used for this project is supported by JetBrains.
