video-subtitle-extractor extracts hard-coded subtitles from a video and generates an srt file. It implements the following:
- detects and extracts subtitle frames (and keyframes, using traditional image-processing methods)
- detects subtitle areas, i.e. their coordinates (and scene text, if you want), using deep learning algorithms
- recognises the content of subtitles, i.e. converts graphic text into plain text, using deep learning algorithms
- filters out non-subtitle text
- removes duplicated subtitle lines
- generates the srt file
- supports multiple languages: Chinese/English, Traditional Chinese, Japanese, Korean, French, German, Russian, Spanish, Portuguese, Italian
- supports multiple modes:
  - fast: high extraction speed, at the risk of missing a few subtitles (recommended)
  - accurate: no missing subtitles, at the cost of lower extraction speed
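The last two pipeline steps above (removing duplicated subtitle lines and generating the srt file) can be illustrated with a minimal sketch. This is not the project's actual implementation: the `Cue` structure and the functions `merge_duplicates`, `format_timestamp` and `to_srt` are hypothetical names introduced here, and the sketch assumes that recognition yields (start, end, text) entries with times in seconds.

```python
from typing import List, NamedTuple


class Cue(NamedTuple):
    """One recognised subtitle (hypothetical structure): times in seconds plus text."""
    start: float
    end: float
    text: str


def merge_duplicates(cues: List[Cue]) -> List[Cue]:
    """Merge consecutive cues that carry the same text into one longer cue."""
    merged: List[Cue] = []
    for cue in cues:
        if merged and merged[-1].text == cue.text:
            # Same line seen again on a later frame: extend the previous cue.
            merged[-1] = merged[-1]._replace(end=max(merged[-1].end, cue.end))
        else:
            merged.append(cue)
    return merged


def format_timestamp(seconds: float) -> str:
    """Format seconds as the srt timestamp HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(cues: List[Cue]) -> str:
    """Render cues as srt text: index, time range, text, then a blank line."""
    blocks = []
    for index, cue in enumerate(cues, start=1):
        time_range = f"{format_timestamp(cue.start)} --> {format_timestamp(cue.end)}"
        blocks.append(f"{index}\n{time_range}\n{cue.text}\n")
    return "\n".join(blocks)


if __name__ == "__main__":
    raw = [Cue(0.0, 1.0, "Hello"), Cue(1.0, 2.5, "Hello"), Cue(3.0, 4.0, "World")]
    print(to_srt(merge_duplicates(raw)))
```

Only exact, consecutive repeats are merged here. Real OCR output usually varies slightly between frames, so a practical version would likely compare lines with a similarity threshold instead of strict equality.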
Download:
- Windows executable (a little slow on the first run): vse.exe
- Windows GPU version: vse_windows_GPU.7z
- Windows CPU version: vse_windows_CPU.zip
- macOS: vse_macOS_CPU.dmg
- You don't need to do any preprocessing to get good results.
- This project runs offline. You don't need to call any online API to get results.
- With the Command Line Interface (CLI) version, you don't need to set the subtitle location manually. The program detects the subtitle area automatically.
- GPU support is available. You can install CUDA and cuDNN to speed up detection and recognition and even get more accurate results.
Please share your suggestions for improving this project in ISSUES.
- Graphic User Interface (GUI):
- Command Line Interface (CLI):
PS: only the CLI version can run on Google Colab.
https://www.anaconda.com/products/individual#Downloads
Make sure you have Python 3.8+ installed. Create and activate a conda virtual environment, then install the dependencies.
- For Mac users and users who have a CPU only, install dependencies:
conda env create -n videoEnv -f ./environment.yml
conda activate videoEnv
- For users who have an Nvidia graphics card (the GPU version can achieve better accuracy), install dependencies:
conda env create -n videoEnv -f ./environment_gpu.yml
conda activate videoEnv
- Run GUI version
python gui.py
- Run CLI version
python ./backend/main.py
Solution: If you are using an Nvidia Ampere architecture graphics card such as the 3060/3070/3080, please use the latest PaddlePaddle version and CUDA 11.2. If you failed to install the GPU environment with conda, please try manual installation:
- Install CUDA 11.2 and cuDNN 8.1.1
Linux
wget https://developer.download.nvidia.com/compute/cuda/11.2.0/local_installers/cuda_11.2.0_460.27.04_linux.run
sudo sh cuda_11.2.0_460.27.04_linux.run --override
1. Type accept
2. Make sure CUDA Toolkit 11.2 is selected (if you have already installed the driver, do not select Driver)
3. Add environment variables
Add the following to ~/.bashrc:
# CUDA
export PATH=/usr/local/cuda-11.2/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-11.2/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
Make the changes take effect:
source ~/.bashrc
Download cudnn-11.2-linux-x64-v8.1.1.33.tgz, then extract it and copy the files:
tar -zxvf cudnn-11.2-linux-x64-v8.1.1.33.tgz
sudo cp ./cuda/include/* /usr/local/cuda-11.2/include/
sudo cp ./cuda/lib64/* /usr/local/cuda-11.2/lib64/
sudo chmod a+r /usr/local/cuda-11.2/lib64/*
sudo chmod a+r /usr/local/cuda-11.2/include/*
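After the Linux steps, it can help to confirm that the environment variables point at the CUDA 11.2 directories. The following is a minimal sketch that uses only the Python standard library. The function name `check_cuda_env` is hypothetical, and the default path `/usr/local/cuda-11.2` is simply the location used in the commands above, so adjust it if your installation differs.

```python
import os
import shutil
from typing import Dict, Optional


def check_cuda_env(cuda_home: str = "/usr/local/cuda-11.2",
                   env: Optional[Dict[str, str]] = None) -> Dict[str, bool]:
    """Report whether PATH and LD_LIBRARY_PATH contain the CUDA directories."""
    env = dict(os.environ) if env is None else env
    path_value = env.get("PATH", "")
    path_dirs = path_value.split(os.pathsep)
    lib_dirs = env.get("LD_LIBRARY_PATH", "").split(os.pathsep)
    return {
        # The CUDA bin directory must be on PATH for tools such as nvcc.
        "bin_on_path": os.path.join(cuda_home, "bin") in path_dirs,
        # The lib64 directory must be visible to the dynamic loader.
        "lib64_on_ld_library_path": os.path.join(cuda_home, "lib64") in lib_dirs,
        # True only if an nvcc executable can actually be found on PATH.
        "nvcc_found": shutil.which("nvcc", path=path_value) is not None,
    }


if __name__ == "__main__":
    for name, ok in check_cuda_env().items():
        print(f"{name}: {'OK' if ok else 'MISSING'}")
```

Run it in a new shell, or after `source ~/.bashrc`, so that the updated variables are visible to the Python process.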
Windows
cuda_11.2.0_460.89_win10.exe
cudnn-11.2-windows-x64-v8.1.1.33.zip
Unzip "cudnn-11.2-windows-x64-v8.1.1.33.zip", then move the files in the "bin", "include" and "lib" folders of the extracted cuda directory into the matching folders under C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.2\
- Install paddlepaddle:
conda install paddlepaddle-gpu==2.1.3 cudatoolkit=11.2 -c https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/Paddle/ -c conda-forge
If you installed CUDA 10.2, please install cuDNN 7.6.5 instead of cuDNN v8.x.
- Install other dependencies:
pip install -r requirements_gpu.txt
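Once the manual installation is finished, a quick check can show whether the installed PaddlePaddle build has CUDA support. This is a minimal sketch rather than part of the project: the helper name `paddle_gpu_available` is hypothetical, and it relies on `paddle.is_compiled_with_cuda()`, which reports how the package was built and does not prove that the driver and cuDNN work at runtime.

```python
import importlib
from typing import Optional


def paddle_gpu_available() -> Optional[bool]:
    """Return True/False for a CUDA-enabled build, or None if paddle is not installed."""
    try:
        paddle = importlib.import_module("paddle")
    except ImportError:
        # PaddlePaddle is missing from the active environment.
        return None
    return bool(paddle.is_compiled_with_cuda())


if __name__ == "__main__":
    status = paddle_gpu_available()
    if status is None:
        print("paddle is not installed in this environment")
    elif status:
        print("paddle was built with CUDA support")
    else:
        print("paddle is a CPU-only build")
```

For a fuller runtime check, PaddlePaddle also ships `paddle.utils.run_check()`, which runs a small computation on the available device.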
_lgeos = CDLL(os.path.join(sys.prefix, 'Library', 'bin', 'geos_c.dll'))
File "C:\Users\Flavi\anaconda3\envs\subEnv\lib\ctypes\__init__.py", line 364, in __init__
self._handle = _dlopen(self._name, mode)
OSError: [WinError 126] The specified module could not be found.
Solution:
- Uninstall Shapely
pip uninstall Shapely -y
- Reinstall Shapely via conda (make sure you have anaconda or miniconda installed)
conda install Shapely
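To confirm that the reinstall fixed the problem, you can try importing Shapely and running one small geometry operation. This is a minimal sketch, not part of the project. The helper name `check_shapely` is hypothetical, and it catches `OSError` because that is the error type shown in the traceback above.

```python
import importlib
from typing import Tuple


def check_shapely() -> Tuple[bool, str]:
    """Try to import shapely.geometry and run a trivial operation with it."""
    try:
        geometry = importlib.import_module("shapely.geometry")
    except (ImportError, OSError) as exc:
        # OSError covers the missing geos_c.dll case from the traceback.
        return False, f"{type(exc).__name__}: {exc}"
    # Buffering a point exercises the underlying GEOS library.
    area = geometry.Point(0, 0).buffer(1.0).area
    return True, f"shapely works, buffered point area is about {area:.2f}"


if __name__ == "__main__":
    ok, message = check_shapely()
    print(("OK: " if ok else "FAILED: ") + message)
```

Run it inside the activated conda environment. If it still fails with the same `OSError`, the pip-installed Shapely is probably still being used.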
The IDE used for this project is supported by JetBrains.