From c9ce752fd1a4c5a95a330e6736eb5e7e76947b8f Mon Sep 17 00:00:00 2001 From: jameson512 <2867557054@qq.com> Date: Sun, 5 May 2024 18:39:12 +0800 Subject: [PATCH] fix: bugs and update --- .gitignore | 1 + README.md | 516 +++------------ README_EN.md | 599 ++++-------------- README_en_v2.md | 423 ++++--------- requirements-linux-gpu.txt | 163 ----- requirements-win-gpu.txt | 163 ----- ...rements-cpu-or-mac.txt => requirements.txt | 25 +- run2.bat | 3 - sp.py | 3 +- version.json | 4 +- videotrans/00chatgpt.txt | 21 - 11 files changed, 384 insertions(+), 1537 deletions(-) delete mode 100644 requirements-linux-gpu.txt delete mode 100644 requirements-win-gpu.txt rename requirements-cpu-or-mac.txt => requirements.txt (91%) delete mode 100644 run2.bat delete mode 100644 videotrans/00chatgpt.txt diff --git a/.gitignore b/.gitignore index a21d2b3a..cc02ae24 100644 --- a/.gitignore +++ b/.gitignore @@ -8,6 +8,7 @@ models/*.pt models/models--Systran--faster-whisper-base dev venv +venv.bak dist source build diff --git a/README.md b/README.md index f10479d9..a50bd6fe 100644 --- a/README.md +++ b/README.md @@ -42,7 +42,7 @@ https://github.com/jianchang512/pyvideotrans/assets/3378335/3811217a-26c8-4084-ba24-7a95d2e13d58 -# 下载预打包版本(仅win10/win11可用,MacOS/Linux系统使用源码部署) +# 预打包版本(仅win10/win11可用,MacOS/Linux系统使用源码部署) > 使用pyinstaller打包,未做免杀和签名,杀软可能报毒,请加入信任名单或使用源码部署 @@ -53,517 +53,195 @@ https://github.com/jianchang512/pyvideotrans/assets/3378335/3811217a-26c8-4084-b 4. 注意:必须解压后使用,不可直接压缩包内双击使用,也不可解压后移动sp.exe文件到其他位置 -# 源码部署/MacOS Linux系统可源码部署 +# MacOS源码部署 -1. 安装好 python 3.10 环境,安装好 git -2. 找个不含空格和中文的文件夹,Linux和Mac下从终端打开该文件夹。Window下地址栏中输入 `cmd`回车。 -3. 终端中执行命令 `git clone https://github.com/jianchang512/pyvideotrans` -4. 继续执行命令 `cd pyvideotrans` -5. 继续执行 `python -m venv venv`,如果你的 python 指向的不行 python3,需要改成 `python3 -m venv venv`,下同 -6. Linux和Mac继续执行命令 `source ./venv/bin/activate`,Window执行命令 `.\venv\scripts\activate` -7. Linux下如果要使用CUDA加速,执行 `pip install -r requirements-linux-gpu.txt --no-deps`,如果不需要CUDA加速,执行 `pip install -r requirements-cpu-or-mac.txt --no-deps` -8. Mac执行 `pip install -r requirements-cpu-or-mac.txt --no-deps` -9. Window下如果要使用CUDA加速,执行 `pip install -r requirements-win-gpu.txt --no-deps`,如果不需要CUDA加速,执行 `pip install -r requirements-cpu-or-mac.txt --no-deps` - -10. windows 和 linux 如果要启用cuda加速,必须有英伟达显卡,并且配置好了CUDA11.8+环境,具体安装见 [CUDA加速支持](https://github.com/jianchang512/pyvideotrans?tab=readme-ov-file#cuda-%E5%8A%A0%E9%80%9F%E6%94%AF%E6%8C%81) - -11. Linux 如果要使用 CUDA 加速,还需要额外执行安装 `pip install nvidia-cublas-cu11 nvidia-cudnn-cu11` - -12. win下解压 ffmpeg.zip 到根目录下 (ffmpeg.exe文件),Linux和Mac 请自行安装 ffmpeg,具体方法可搜索 “MacOS 安装 ffmpeg” 或 “Linux 安装ffmpeg” - -13. `python sp.py` 打开软件界面 - -14. Ubuntu 下可能还需要安装 Libxcb 库,安装命令 - - ``` - sudo apt-get update - sudo apt-get install libxcb-cursor0 - ``` - -15. Mac下需要执行 `brew install libsndfile` 安装libsndfile - -[Mac下详细部署方案](https://pyvideotrans.com/mac.html) - - -# 使用方法 / [更多文档请查看 pyvideotrans.com](https://pyvideotrans.com/guide.html) - -1. 选择视频:点击选择mp4/avi/mov/mkv/mpeg视频,可选择多个视频; - -2. 保存到..:如果不选择,则默认生成在同目录下的 `_video_out`,同时在该目录下的srt文件夹中将创建原语言和目标语言的两种字幕文件 - -3. 翻译渠道:可选 microsoft|google|baidu|tencent|freeChatGPT|chatGPT|Azure|Gemini|DeepL|DeepLX|OTT 翻译渠道 - -4. 代理地址:如果你所在地区无法直接访问 google/chatGPT,需要在软件界面 网络代理 中设置代理,比如若使用 v2ray ,则填写 `http://127.0.0.1:10809`,若clash,则填写 `http://127.0.0.1:7890`. 如果你修改了默认端口或使用的其他代理软件,则按需填写 - -5. 原始语言:选择待翻译视频里的语言种类 - -6. 目标语言:选择希望翻译到的语言种类 - -7. 
TTS和配音角色:选择翻译目标语言后,可从配音选项中,选择配音角色; - - 硬字幕: - 是指始终显示字幕,不可隐藏,如果希望网页中播放时也有字幕,请选择硬字幕嵌入,硬字幕时可通过videotrans/set.ini 中 fontsize设置字体大小 - - 硬字幕(双): - 将上下两排分别显示目标语言字幕和原始语言字幕 - - 软字幕: - 如果播放器支持字幕管理,可显示或者隐藏字幕,该方式网页中播放时不会显示字幕,某些国产播放器可能不支持,需要将生成的视频同名srt文件和视频放在一个目录下才会显示 - - 软字幕(双): - 将嵌入2种语言的字幕,可通过播放器的字幕显示/隐藏功能来切换不同语言字幕 - - -8. 语音识别模型: 选择 tiny/base/small/medium/large-v2/large-v3, 识别效果越来越好,但识别速度越来越慢,所需内存越来越大,内置base模型,其他模型请单独下载后,解压放到 `当前软件目录/models`目录下.如果GPU显存低于4G,不要使用 large-v3 - - 整体识别:由模型自动对整个音频断句处理,多大的视频请勿选择整体识别,避免显存不足闪退 - - 预先分割:适合很大的视频,事先切成1分钟的小片段逐次识别和断句 - - 均等分割:按照固定秒数均等切割,每条字幕时长相等,时长由set.ini中interval_split控制 - - [全部模型下载地址](https://github.com/jianchang512/stt/releases/tag/0.0) - - **特别注意** - - faster模型:如果下载的是faster模型,下载后解压,将压缩包内的"models--Systran--faster-whisper-xx"文件夹复制到models目录内,解压复制后 models 目录下文件夹列表如下 - ![](https://github.com/jianchang512/stt/assets/3378335/5c972f7b-b0bf-4732-a6f1-253f42c45087) - - openai模型:如果下载的是openai模型,下载后直接将里面的 .pt 文件复制到 models文件夹下即可。 - - GoogleSpeech:使用google提供的语音识别服务生成字幕,需要填写代理,确保可连接到google - - -9. 配音语速:填写 -90到+90 之间的数字,同样一句话在不同语言语音下,所需时间是不同的,因此配音后可能声画字幕不同步,可以调整此处语速,负数代表降速,正数代表加速播放。 - -10. 声音、画面、字幕对齐: “配音整体语速” “配音自动加速” “视频自动慢速” “语音前后延展” - -> -> 翻译后不同语言下发音时长不同,比如中文3s,翻译为英文可能5s,导致时长和视频不一致。 -> -> 4种解决方式: -> -> 1. 设置配音整体语速,全局加速(某些TTS不支持) -> -> 2. 强制配音加速播放,以便缩短配音时长和视频对齐 -> -> 3. 强制视频慢速播放,以便延长视频时长和配音对齐。 -> -> 4. 如果前后有静音片段,则前后延展占据静音区\n实际使用中,结合此4项效果最佳 -> -> 实现原理请查看博文 https://juejin.cn/post/7343691521601290240 -> - -12. **CUDA加速**:确认你的电脑显卡为 N卡,并且已配置好CUDA环境和驱动,则开启选择此项,速度能极大提升,具体配置方法见下方[CUDA加速支持](https://github.com/jianchang512/pyvideotrans?tab=readme-ov-file#cuda-%E5%8A%A0%E9%80%9F%E6%94%AF%E6%8C%81) - -13. TTS: 可用 edgeTTS(免费) 、gtts(免费需代理)、AzureTTS、 openai TTS-1、Elevenlabs、clone-voice(免费需自建)、GPT-SoVITS(免费需自建)、自定义TTS,openai需要使用官方接口或者开通了tts-1模型的三方接口,也可选择clone-voice进行原音色配音。同时支持使用自己的tts服务,在设置菜单-自定义TTS-API中填写api地址 - -14. 点击 开始按钮 底部会显示当前进度和日志,右侧文本框内显示字幕 - -15. 字幕解析完成后,将暂停等待修改字幕,如果不做任何操作,15s后将自动继续下一步。也可以在右侧字幕区编辑字幕,然后手动点击继续合成 - -16. 将在目标文件夹中视频同名的子目录内,分别生成两种语言的字幕srt文件、原始语音和配音后的wav文件,以方便进一步处理. - -17. 设置行角色:可对字幕中的每行设定发音角色,首先左侧选好TTS类型和角色,然后点击字幕区右下方“设置行角色”,在每个角色名后面文本中中,填写要使用该角色配音的行编号,如下图: - ![](./images/p2.png) - -18. 保留背景音:如果选择该项,则会先将视频中的人声和背景伴奏分离出来,其中背景伴奏最终再和配音音频合并,最后生成的结果视频中将保留背景伴奏。**注意**:该功能基于uvr5实现,如果你没有足够的N卡GPU显存,比如8G以上,建议慎重选择,可能非常慢并非常消耗资源。如果视频比较大, 建议选择单独的视频分离工具,比如 uvr5 或 vocal-separate https://juejin.cn/post/7341617163353341986 - -19. 原音色克隆配音clone-voice:首先安装部署[clone-voice](https://github.com/jianchang512/clone-voice)项目, 下载配置好“文字->声音”模型,然后在本软件中TTS类型中选择“clone-voice”,配音角色选择“clone”,即可实现使用原始视频中的声音进行配音。使用此方式时,为保证效果,建议选择“保留背景音”,以剔除背景噪声。 - -20. 使用GPT-SoVITS配音:首先安装部署好GPT-SoVITS项目,然后启动 GPT-SoVITS的api.py,在视频翻译配音软件-设置菜单-GPT-SoVITS API 中填写接口地址、参考音频等。 - -GPT-SoVITS 自带的 api.py 不支持中英混合发音,若需支持,请 [点击去下载该文件](https://github.com/jianchang512/gptsovits-api/releases/tag/v0.1) ,将该压缩包内的 api2.py 复制到 GPT-SoVITS 根目录下,启动方法与自带的api.py一样,可参考使用教程 https://juejin.cn/post/7343138052973297702 - - -21. 在 `videotrans/chatgpt.txt` `videotrans/azure.txt` `videotrans/gemini.txt` 文件中,可分别修改 chatGPT、AzureGPT、Gemini Pro 的提示词,必须注意里面的 `{lang}` 代表翻译到的目标语言,不要删除不要修改。提示词需要保证告知AI将按行发给它的内容翻译后按行返回,返回的行数需要同发给它的行数一致。 - -22. 添加背景音乐:该功能和“保留背景音”类似,但实现方式不同,只可在“标准功能模式”和“字幕创建配音”模式下使用。 -“添加背景音乐”是预先从本地计算机中选择一个作为背景声音的音频文件,文件路径显示在右侧文本框中,在处理结束输出结果视频时,将该音频混入,最终生成的视频里会播放该背景音频文件。 - -如果同时也选中了“保留背景音”,那么原始视频里的背景音也会保留。 - -添加背景音乐后,如果又不想要了,直接在右侧文本框中删掉显示的内容即可。 - - - -# 常见问题 / [更多文档请查看 pyvideotrans.com](https://pyvideotrans.com/guide.html) - -1. 使用google翻译或者chatGPT,提示出错 +0. 打开终端窗口,分别执行这3条命令 - 国内使用google或chatGPT官方接口,都需要挂梯子 + ``` + brew install libsndfile -2. 
已使用了全局代理,但看起来并没有走代理

   brew install ffmpeg

   需要在软件界面“网络代理”中设置具体的代理地址,格式为 http://127.0.0.1:端口号

   brew install git

3. 提示 FFmpeg 不存在

   brew install python@3.12

   首先查看确定软件根目录下存在 ffmpeg.exe, ffprobe.exe 文件或是否存在ffmpeg目录,如果不存在,解压 ffmpeg.7z,将这2个文件放到软件根目录下

   ```

4. windows上开启了 CUDA,但是提示错误

   然后继续执行如下2条命令

   A: [首先查看详细安装方法](https://juejin.cn/post/7318704408727519270),确定你已正确安装了cuda相关工具,如果仍存在错误,[点击下载 cuBLAS](https://github.com/jianchang512/stt/releases/download/0.0/cuBLAS_win.7z),解压后将里面的dll文件复制到 C:/Windows/System32下

   ```
   export PATH="/usr/local/opt/python@3.12/bin:$PATH"

   B: 若确定和A无关,那么请检查视频是否是H264编码的mp4,有些高清视频是 H265 编码的,这种不支持,可尝试在“视频工具箱”中转为H264视频

   source ~/.bash_profile ; source ~/.zshrc

   C: GPU下对视频进行硬件解码编码对数据正确性要求严格,容错率几乎为0,任何一点错误都会导致失败,加上显卡型号、驱动版本、CUDA版本、ffmpeg版本不同版本之间的差异等,导致很容易出现兼容性错误。目前加了回退,GPU上失败后自动使用CPU软件编解码。失败时logs目录下日志里会记录出错信息。

   ```

5. 提示模型不存在?

   [全部模型下载地址](https://github.com/jianchang512/stt/releases/tag/0.0)

   **模型分为两类:**

1. 创建不含空格和中文的文件夹,在终端中进入该文件夹。
2. 终端中执行命令 `git clone https://github.com/jianchang512/pyvideotrans`
3. 执行命令 `cd pyvideotrans`
4. 继续执行 `python -m venv venv`
5. 继续执行命令 `source ./venv/bin/activate`,执行完毕查看确认终端命令提示符已变成以`(venv)`开头,后续命令都必须在提示符以`(venv)`开头的状态下执行
6. 执行 `pip install -r requirements.txt --no-deps`,如果提示失败,执行如下2条命令切换pip镜像到阿里镜像

   一类是适用于“faster模型”的。

   ```
   pip config set global.index-url https://mirrors.aliyun.com/pypi/simple/
   pip config set install.trusted-host mirrors.aliyun.com
   ```

   下载解压后,会看到文件夹,类似 “models--Systran--faster-whisper-xxx”形式的,xxx代表模型名,比如 base/small/medium/large-v3等,解压后直接将该文件夹复制到此目录下即可。

   然后重新执行上面的安装命令
   如果已切换到阿里镜像源,仍提示失败,请尝试执行 `pip install -r requirements.txt --ignore-installed --no-deps`

   如果所有faster模型下载后,当前models文件夹下应该能看到这几个文件夹

7. `python sp.py` 打开软件界面(可先用下文的自检脚本确认环境)

   models--Systran--faster-whisper-tiny
   models--Systran--faster-whisper-base
   models--Systran--faster-whisper-small
   models--Systran--faster-whisper-medium
   models--Systran--faster-whisper-large-v2
   models--Systran--faster-whisper-large-v3

8. Ubuntu 下可能还需要安装 Libxcb 库,安装命令

   ```

   另一类是适用于"openai模型的",下载解压后,直接就是 xx.pt 文件,比如 base.pt/small.pt/medium.pt/large-v3.pt, 直接将该pt文件复制到此文件夹内即可。

   如果所有openai模型下载后,当前models文件夹下应该能直接看到 base.pt, small.pt, medium.pt, large-v1.pt, large-v3.pt

6. 提示目录不存在或权限错误

   在sp.exe上右键使用管理员权限打开

   sudo apt-get update
   sudo apt-get install libxcb-cursor0

7. 提示错误,但没有详细出错信息

   ```

   打开logs目录,找到最新的log日志文件,拉到最底部,即可看到报错信息。

[Mac下详细部署方案](https://pyvideotrans.com/mac.html)

8. large-v3模型非常慢

   如果你没有N卡GPU,或者没有配置好CUDA环境,或者显存低于8G,请不要使用这个模型,否则会非常慢和卡顿

9. 提示缺少cublasxx.dll文件

   有时会遇到“cublasxx.dll不存在”的错误,此时需要下载 cuBLAS,然后将dll文件复制到系统目录下

   [点击下载 cuBLAS](https://github.com/jianchang512/stt/releases/download/0.0/cuBLAS_win.7z) ,解压后将里面的dll文件复制到 C:/Windows/System32下

   [cuBLAS.and.cuDNN_win_v4](https://github.com/Purfview/whisper-standalone-win/releases/download/libs/cuBLAS.and.cuDNN_win_v4.7z)
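MacOS 源码部署完成后,在首次运行 `python sp.py` 前,可先做一次快速环境自检(Linux 部署同样适用)。下面是一个假设性的示意脚本,仓库中并无此文件;其中检查的 PySide6 仅作演示,实际依赖以 requirements.txt 为准:

```python
# check_env.py(假设性示例脚本,非仓库自带)
# 在 (venv) 激活状态下运行:python check_env.py
import shutil
import sys

def main() -> int:
    ok = True
    # ffmpeg 必须位于 PATH 中(MacOS 可通过 brew install ffmpeg 安装)
    if shutil.which("ffmpeg") is None:
        print("未找到 ffmpeg,请先安装并确认其在 PATH 中")
        ok = False
    # 示例:检查某个关键依赖是否已随 requirements.txt 装好(此处以 PySide6 为例)
    try:
        import PySide6  # noqa: F401
    except ImportError:
        print("缺少依赖,请确认已在 (venv) 中执行 pip install -r requirements.txt --no-deps")
        ok = False
    print("环境自检通过" if ok else "环境自检未通过")
    return 0 if ok else 1

if __name__ == "__main__":
    sys.exit(main())
```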
# Linux 源码部署

0. CentOS/RHEL系依次执行如下命令安装 python3.12

11. 怎样使用自定义音色

    设置菜单-自定义TTS-API,填写自己的tts服务器接口地址。

    将以POST请求向填写的API地址发送application/www-urlencode数据:

```
# 发送的请求数据:
text:需要合成的文本/字符串

sudo yum update

language:文字所属语言代码(zh-cn,zh-tw,en,ja,ko,ru,de,fr,tr,th,vi,ar,hi,hu,es,pt,it)/字符串

sudo yum groupinstall "Development Tools"

voice:配音角色名称/字符串

sudo yum install openssl-devel bzip2-devel libffi-devel

rate:加减速值,0或者 '+数字%' '-数字%',代表在正常速度基础上进行加减速的百分比/字符串

cd /tmp

ostype:win32或mac或linux操作系统类型/字符串

wget https://www.python.org/ftp/python/3.12.0/Python-3.12.0.tgz

extra:额外参数/字符串

tar xzf Python-3.12.0.tgz
cd Python-3.12.0

# 期待从接口返回json格式数据:
{
   code:0=合成成功时,>0的数字代表失败
   msg:ok=合成成功时,其他为失败原因
   data:在合成成功时,返回mp3文件的完整url地址,用于在软件内下载。失败时为空
}

./configure --enable-optimizations

sudo make && sudo make install

```

13. 字幕语音无法对齐

sudo alternatives --install /usr/bin/python3 python3 /usr/local/bin/python3.12 2

> 翻译后不同语言下发音时长不同,比如一句话中文3s,翻译为英文可能5s,导致时长和视频不一致。
>
> 2种解决方式:
>
> 1. 强制配音加速播放,以便缩短配音时长和视频对齐
>
> 2. 强制视频慢速播放,以便延长视频时长和配音对齐。
>

14. 字幕不显示或显示乱码

sudo yum install -y ffmpeg
```

>
> 采用软合成字幕:字幕作为单独文件嵌入视频,可再次提取出,如果播放器支持,可在播放器字幕管理中启用或禁用字幕;
>
> 注意很多国内播放器必须将srt字幕文件和视频放在同一目录下且名字相同,才能加载软字幕,并且可能需要将srt文件转为GBK编码,否则显示乱码,
>

## Ubuntu/Debian系执行如下命令安装python3.12

15. 如何切换软件界面语言/中文or英文

```

打开软件目录下 videotrans/set.ini 文件,然后将 `lang=` 后填写语言代码,`zh`代表中文,`en`代表英文,修改后重启软件

apt update && apt upgrade -y

```
;The default interface follows the system and can also be specified manually here, zh=Chinese interface, en=English interface.
;默认界面跟随系统,也可以在此手动指定,zh=中文界面,en=英文界面
lang =

apt install software-properties-common -y

```

add-apt-repository ppa:deadsnakes/ppa

16. 尚未执行完毕就闪退

apt update

如果启用了cuda并且电脑已安装好了cuda环境,但没有手动安装配置过cudnn,那么会出现该问题,去安装和cuda匹配的cudnn。比如你安装了cuda12.3,那么就需要下载cudnn for cuda12.x压缩包,然后解压后里面的3个文件夹复制到cuda安装目录下。具体教程参考
https://juejin.cn/post/7318704408727519270

sudo apt-get install libxcb-cursor0

如果cudnn按照教程安装好了仍闪退,那么极大概率是GPU显存不足,可以改为使用 medium模型,显存不足8G时,尽量避免使用large-v3模型,尤其是视频大于20M时,否则可能显存不足而崩溃

apt install python3.12

17. 如何调节字幕字体大小

curl -sS https://bootstrap.pypa.io/get-pip.py | python3.12

如果嵌入硬字幕,可以通过修改 videotrans/set.ini 中的 fontsize=0为一个合适的值,来调节字体大小。0代表默认尺寸,20代表字体尺寸为20个像素

# 安装成功后执行 python3.12 -m pip -V,应输出类似:pip 23.2.1 from /usr/local/lib/python3.12/site-packages/pip (python 3.12)

18. macos报错

sudo update-alternatives --install /usr/bin/python python /usr/local/bin/python3.12 1

OSError: ctypes.util.find_library() did not manage to locate a library called 'sndfile'

sudo update-alternatives --config python

解决办法:

apt-get install ffmpeg
```

找到libsndfile安装位置,通过brew安装的话一般在:/opt/homebrew/Cellar/libsndfile,
然后将该路径添加到环境变量:export DYLD_LIBRARY_PATH=/opt/homebrew/Cellar/libsndfile/1.2.2/lib:$DYLD_LIBRARY_PATH

19. GPT-SoVITS API 不支持 中英 混合发音

GPT-SoVITS 自带的 api.py 不支持中英混合发音,若需支持,请 [点击下载该文件](https://github.com/jianchang512/stt/releases/download/0.0/GPT-SoVITS.api.py.zip) ,用该压缩包内的 api.py,覆盖 GPT-SoVITS 自带的api.py

20. 是否有详细教程

有文档网站 https://pyvideotrans.com ,不过这里传图不便,因此更新较慢,请查看掘金博客获取最新教程, https://juejin.cn/user/4441682704623992/columns

或关注我的微信公众号,内容基本等同掘金博客,微信搜一搜查看公众号 `pyvideotrans`

# 高级设置 videotrans/set.ini

**请勿随意调整,除非你知道将会发生什么**

```
;####################
;如果你不确定修改后将会带来什么影响,请勿随意修改,修改前请做好备份, 如果出问题请恢复
;升级前请做好备份,升级后按照原备份重新修改。请勿直接用备份文件覆盖,因为新版本可能有新增配置
;If you are not sure what effect the modification will bring, please don't modify it arbitrarily, please make a good backup before modification, and restore it if something goes wrong.
;Please make a backup before upgrading, and then modify it according to the original backup after upgrading. 
Please do not overwrite the backup file directly, because the new version may have new configurations. - -;##############界面语言文字############################# -;Interface language text ############################# -;默认界面跟随系统,也可以在此手动指定,zh=中文界面,en=英文界面 -;Default interface follows the system, you can also specify it manually here, zh=Chinese interface, en=English interface. -lang = - -;##################视频质量############################ -;Video quality ############################ -;视频处理质量,0-51的整数,0=无损处理尺寸较大速度很慢,51=质量最低尺寸最小处理速度最快 -;Video processing quality, integer 0-51, 0=lossless processing with large size is very slow, 51=lowest quality with smallest size is fastest processing speed -crf=13 -;#################模型名字列表################################# -;List of model names ################################# -;可供选择的chatGPT模型,以英文逗号分隔 -;Available chatGPT models, separated by English commas -chatgpt_model=gpt-3.5-turbo,gpt-4,gpt-4-turbo-preview,qwen,moonshot-v1-8k - -;################声画字幕对齐相关################################# -;Sound and picture subtitle alignment related ################################# - -;音频最大加速倍数,默认1.5,即最大加速到 1.5倍速度,需设置1-100的数字,比如1.5,代表最大加速1.5倍 -;Maximum audio acceleration, default 1.5, that is, the maximum acceleration to 1.5 times the speed, need to set the number of 1-100, such as 1.5, represents the maximum acceleration 1.5 times -audio_rate=1.5 - -; 设为大于1的数,代表最大允许慢速多少倍,0或1代表不进行视频慢放 -; set to a number greater than 1, representing the maximum number of times allowed to slow down, 0 or 1 represents no video slowdown -video_rate=20 - -;是否移除配音末尾空白,true=移除,false=不移除 -;Whether to remove voiceover end blanks, true=remove, false=don't remove -remove_silence=true - -;是否移除原始字幕时长大于配音时长 的静音,比如原时长5s,配音后3s,是否移除这2s静音,true=移除,false=不移除 -;If or not remove the silence when the original duration of subtitle is longer than the dubbing duration, for example, if the original duration is 5s and the dubbing duration is 3s, if or not remove the 2s of silence, true=remove, false=don't remove. -remove_srt_silence=false - -;移除2条字幕间的静音长度ms,比如100ms,即如果两条字幕间的间隔大于100ms时,将移除100ms -; Remove the mute length of ms between 2 subtitles, e.g. 100ms, i.e. if the interval between two subtitles is greater than 100ms, 100ms will be removed -remove_white_ms=0 - - -;true=强制修改字幕时间轴以便匹配声音,false=不修改,保持原始字幕时间轴,不修改可能导致字幕和声音不匹配 -;true=Forces the subtitle timeline to be modified in order to match the sound, false=Does not modify it, keeps the original subtitle timeline, not modifying it may result in a mismatch between the subtitle and the sound -force_edit_srt=true - -; ###############语句分割相关################################## -; statement segmentation related ################################## - -;用于 预先分割 和 整体识别 时,作为切割依据的最小静音片段ms,默认200ms 以及最大句子时长3s -;The minimum silent segmentation ms, default 200ms, and the maximum sentence length 3s are used for pre-segmentation and overall recognition as the basis for segmentation. -overall_silence=200 -overall_maxsecs=3 - -;用于均等分割时,作为切割依据的最小静音片段ms,默认200ms,即只有大于等于200ms的静音处才分割 -; used for equal segmentation, as the basis for cutting the minimum silence segment ms, the default 200ms, that is, only greater than or equal to 200ms silence at the split -voice_silence=200 -;用于均等分割时的每个切片时长 秒,默认 6s,即每个字幕时长大约都是6s -;seconds per slice for equalization, default 6s, i.e. each subtitle is about 6s. 
-interval_split=6 - - -;################翻译配音速度############################# -;Translation dubbing speed ############################# - -;同时翻译的数量,1-20,不要太大,否则可能触发翻译api频率限制 -;Translation dubbing speed ############################# -trans_thread=15 - -;翻译出错重试次数 -;Number of retries for translation errors -retries=2 - -;同时配音的数量,1-10,建议不要大于5,否则容易失败 -; The number of simultaneous voiceovers, 1-10, it is recommended not to be greater than 5, otherwise it will be easy to fail -dubbing_thread=5 - - -;字幕识别完成等待翻译前的暂停秒数,和翻译完等待配音的暂停秒数 -; seconds of pause before subtitle recognition is completed and waiting for translation, and seconds of pause after translation and waiting for dubbing. -countdown_sec=15 - - -;#####################背景声音######################################## -;Background sound ######################################## - -;背景声音音量降低或升高幅度,大于1升高,小于1降低 -; Background sound volume is lowered or raised, greater than 1 raised, less than 1 lowered -backaudio_volume=0.5 - -;背景音分离时切分片段,太长的音频会耗尽显存,因此切分后分离,单位s,默认 600s -;Background sound is separated by a slice, if the audio is too long, it will exhaust the memory, so it is separated by a slice, the unit is s, default is 600s. -separate_sec=600 - -; 如果背景音频时长短于视频,是否重复播放背景音,默认否 -;Background sound is separated by a slice, if the audio is too long, it will exhaust the memory, so it is separated by a slice, the unit is s, default is 600s. -loop_backaudio=false - - -;####################GPU FFmpeg ##################################### - -;硬件编码设备,cuvid或cuda -; Hardware encoding device, cuvid or cuda -hwaccel=cuvid - -;硬件输出格式,nv12或cuda -; Hardware encoding device, cuvid or cuda -hwaccel_output_format=nv12 - -;是否禁用硬件解码,true=禁用,兼容性好;false=启用,可能某些硬件上有兼容错误 -;Whether to disable hardware decoding, true=disable, good compatibility; false=enable, there may be compatibility errors on some hardware. -no_decode=true +**打开任意一个终端,执行 `python3 -V`,如果显示 “3.12.0”,说明安装成功,否则失败** +1. 创建个不含空格和中文的文件夹, 从终端打开该文件夹。 +3. 终端中执行命令 `git clone https://github.com/jianchang512/pyvideotrans` +4. 继续执行命令 `cd pyvideotrans` +5. 继续执行 `python -m venv venv` +6. 继续执行命令 `source ./venv/bin/activate`,执行完毕查看确认终端命令提示符已变成已`(venv)`开头,以下命令必须确定终端提示符是以`(venv)`开头 +7. 执行 `pip install -r requirements.txt --no-deps`,如果提示失败,执行如下2条命令切换pip镜像到阿里镜像 -; ##################字幕识别-GPU提高降低性能相关############################################ -;Subtitle Recognition - GPU Improvement Reduced Performance Related -;从视频中识别字幕时的cuda数据类型,int8=消耗资源少,速度快,精度低,float32=消耗资源多,速度慢,精度高,int8_float16=设备自选 -; cuda data type when recognizing subtitles from video, int8=consumes fewer resources, faster, lower precision, float32=consumes more resources, slower, higher precision, int8_float16=device of choice -cuda_com_type=float32 + ``` -;中文语言的视频时,用于识别的提示词,可解决简体识别为繁体问题。但注意,有可能直接会将提示词作为识别结果返回 -;The prompt words used to recognize videos in Chinese language can solve the problem of recognizing simplified Chinese as traditional Chinese. But note that there is a possibility that the prompt word will be returned directly as the result of the recognition. -initial_prompt_zh= + pip config set global.index-url https://mirrors.aliyun.com/pypi/simple/ + pip config set install.trusted-host mirrors.aliyun.com -;字幕识别时,cpu进程 -;cpu process during subtitle recognition -whisper_threads=4 + ``` -;字幕识别时,同时工作进程 -; Simultaneous work processes during subtitle recognition -whisper_worker=1 + 然后重新执行,如果已切换到阿里镜像源,仍提示失败,请尝试执行 `pip install -r requirements.txt --ignore-installed --no-deps ` +8. 
如果要使用CUDA加速,分别执行

;字幕识别时精度调整,1-5,1=消耗资源最低,5=消耗最多,如果显存充足,可以设为5,可能会取得更精确的识别结果
;Subtitle recognition accuracy adjustment, 1-5, 1 = consume the lowest resources, 5 = consume the most, if the video memory is sufficient, you can set it to 5, you may get more accurate recognition results.
beam_size=5
best_of=5

   `pip uninstall -y torch torchaudio`

;faster-whisper字幕整体识别模式时启用自定义静音分割片段,true=启用,显存不足时,可以设为false禁用
;Enable custom mute segmentation when subtitles are in overall recognition mode, true=enable, can be set to false to disable when video memory is insufficient.
vad=true

;0=占用更少GPU资源但效果略差,1=占用更多GPU资源同时效果更好
;0 = less GPU resources but slightly worse results, 1 = more GPU resources and better results at the same time
temperature=1

   `pip install torch torchaudio --index-url https://download.pytorch.org/whl/cu118`

;同 temperature, true=占用更多GPU效果更好,false=占用更少GPU效果略差
; same as temperature, true=better with more GPUs, false=slightly worse with fewer GPUs
condition_on_previous_text=false

   `pip install nvidia-cublas-cu11 nvidia-cudnn-cu11`

9. Linux 如果要启用cuda加速,必须有英伟达显卡,并且配置好了CUDA11.8+环境,具体安装见 [CUDA加速支持](https://github.com/jianchang512/pyvideotrans?tab=readme-ov-file#cuda-%E5%8A%A0%E9%80%9F%E6%94%AF%E6%8C%81)
10. `python sp.py` 打开软件界面

; ###################字幕设置相关 Subtitle Settings######################################
;硬字幕时可在这里设置字幕字体大小,填写整数数字,比如12,代表字体12px大小,20代表20px大小,0等于默认大小
;Hard subtitles can be set here when the subtitle font size, fill in the integer numbers, such as 12, on behalf of the font size of 12px, 20 on behalf of the size of 20px, 0 is equal to the default size
fontsize=16

# Windows10/11 源码部署

;中日韩字幕一行长度字符个数,多于这个将换行
;CJK subtitle line length character count, more than this will be line feeds.
cjk_len=30

0. 打开 https://www.python.org/downloads/ 下载 Windows 版 Python 3.12 安装包,下载后双击,一路 next,注意要选中“Add to PATH”

;其他语言换行长度,多于这个字符数量将换行
;Other language line breaks, more than this number of characters will be a line break.
other_len=60

   **打开一个cmd,执行 `python -V`,如果输出的不是 3.12 系列版本号(例如 `3.12.3`),说明安装出错,或没有选中 `Add to PATH`,请重新安装**

;用于兼容ffmpeg,如果出现ffmpeg报错,错误中含有 vysnc字样,可改为 vsync=vfr
; used for ffmpeg compatibility, if ffmpeg error, the error contains the word vysnc, can be changed to vsync=vfr
vsync=passthrough

1. 打开 https://github.com/git-for-windows/git/releases/download/v2.45.0.windows.1/Git-2.45.0-64-bit.exe ,下载git,下载后双击一路下一步。
2. 找个不含空格和中文的文件夹,地址栏中输入 `cmd` 回车,打开终端,以下命令均在该终端中执行
3. 执行命令 `git clone https://github.com/jianchang512/pyvideotrans`
4. 继续执行命令 `cd pyvideotrans`
5. 继续执行 `python -m venv venv`
6. 继续执行命令 `.\venv\scripts\activate`,执行后请查看确认命令行开头已变成了`(venv)`,否则说明出错

;当配音长度大于视频长度时,是否延长视频,true=延长,false=不延长,将截断音频
;If or not extend the video when the dubbing length is greater than the video length, true=extend, false=don't extend, the audio will be truncated.
append_video=true

7. 如果要使用CUDA加速,分别执行

;true=批量任务时分为 识别、翻译、配音、合并 多阶段交叉执行,加快速度,false=前面全部完成后才开始下一个
;true=The batch task is divided into recognition, translation, dubbing, merging, and multi-stage cross-execution to accelerate the speed, false=The next one starts after all the previous ones are completed.
cors_run=true

   `pip uninstall -y torch torchaudio`

   `pip install torch torchaudio --index-url https://download.pytorch.org/whl/cu118`

```

8. windows 如果要启用cuda加速,必须有英伟达显卡,并且配置好了CUDA11.8+环境,具体安装见 [CUDA加速支持](https://github.com/jianchang512/pyvideotrans?tab=readme-ov-file#cuda-%E5%8A%A0%E9%80%9F%E6%94%AF%E6%8C%81),安装完成后可用下方的检测脚本验证
9. (此步仅 Linux 需要,Windows 可跳过)如果要使用 CUDA 加速,还需额外执行 `pip install nvidia-cublas-cu11 nvidia-cudnn-cu11`
10. 解压 ffmpeg.zip 到当前源码目录下,提示覆盖则覆盖,解压后确保源码下的 ffmpeg 文件夹内能看到 ffmpeg.exe、ffprobe.exe、ytwin32.exe。
11. `python sp.py` 打开软件界面
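CUDA 相关步骤完成后,可用下面的 Python 片段快速验证 PyTorch 能否使用 GPU。这只是一个最小示意;仓库自带的 `testcuda.py` 检测更完整,可直接执行 `python testcuda.py`:

```python
# cuda_check.py(最小示意,仅验证 PyTorch 层面的 CUDA/cuDNN 可用性)
import torch

if torch.cuda.is_available():
    print("CUDA 可用:", torch.cuda.get_device_name(0))
    # cuDNN 未正确安装时,语音识别阶段可能闪退
    print("cuDNN 可用:", torch.backends.cudnn.is_available())
else:
    print("CUDA 不可用,任务将回退到 CPU 执行")
```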
# 使用教程和文档

请查看 https://pyvideotrans.com/guide.html


# 语音识别模型

   下载地址:https://pyvideotrans.com/model.html

   说明和区别介绍:https://pyvideotrans.com/02.html


diff --git a/README_EN.md b/README_EN.md
index 45f7133f..c6b6110e 100644
--- a/README_EN.md
+++ b/README_EN.md
@@ -1,604 +1,262 @@
-[简体中文](./README.md) / [👑Donate to this project](./about.md)
+[简体中文](./README.md) / [👑 Donate to this project](./about.md)

# Video Translation and Dubbing Tool

>
-> This is a video translation and dubbing tool that can translate a video from one language to a specified language, automatically generating and adding subtitles and dubbing in that language.
+> This is a video translation and dubbing tool that can translate a video from one language to a specified language and automatically generate and add subtitles and dubbing in that language.
>
-> Voice recognition uses `faster-whisper` `openai-whisper` offline models.
+> The voice recognition supports `faster-whisper`, `openai-whisper`, `GoogleSpeech`, and the `zh_recogn` Alibaba Chinese speech recognition model.
>
-> Text translation supports `microsoft | google | baidu | tencent | chatGPT | Azure | Gemini | DeepL | DeepLX | Offline translation OTT`,
+> Text translation supports `Microsoft Translate | Google Translate | Baidu Translate | Tencent Translate | ChatGPT | AzureAI | Gemini | DeepL | DeepLX | Offline translation OTT`, and includes a free ChatGPT API translation interface sponsored by apiskey.top.
>
-> Text-to-speech synthesis supports `Microsoft Edge tts` `Openai TTS-1` `Elevenlabs TTS` `Custom TTS server API` `GPT-SoVITS` [clone-voice](https://github.com/jianchang512/clone-voice)
+> Text-to-speech synthesis supports `Microsoft Edge tts`, `Google tts`, `Azure AI TTS`, `Openai TTS`, `Elevenlabs TTS`, `Custom TTS server API`, `GPT-SoVITS`, [clone-voice](https://github.com/jianchang512/clone-voice).
>
> Allows retaining background accompaniment music, etc. (based on uvr5)
>
-> Supported languages: Simplified and Traditional Chinese, English, Korean, Japanese, Russian, French, German, Italian, Spanish, Portuguese, Vietnamese, Thai, Arabic, Turkish, Hungarian, Hindi
+> Supported languages: Simplified and Traditional Chinese, English, Korean, Japanese, Russian, French, German, Italian, Spanish, Portuguese, Vietnamese, Thai, Arabic, Turkish, Hungarian, Hindi, Ukrainian, Kazakh, Indonesian, Malay.

# Main Uses and How to Use

-【Translate videos and dubbing】Set various options as needed, freely configure combinations, to achieve translation and dubbing, automatic acceleration/deceleration, merging, etc.

-【Recognize subtitles without translation】Select a video file, choose the source language of the video, and then 【recognize text from video voice】and automatically export subtitle files to the target folder. 
- -【Extract subtitles and translate】Select a video file, choose the source language, set the target language you want to translate to, then 【recognize text from video voice】and translate it to the target language, then export bilingual subtitle files to the target folder. - -【Combine subtitles and video】Select a video, then drag and drop existing subtitle files to the right-hand subtitle area, set both the source and target languages to the language used in the subtitles, then choose the dubbing type and character, and start execution. +【Translate videos and dub】Translate the audio in videos into the voice of another language and embed subtitles in that language -【Create dubbing for subtitles】Drag and drop local subtitle files to the right-side subtitle editor, then choose the target language, dubbing type and character, to generate the dubbed audio file to the target folder. +【Convert audio or video to subtitles】Recognize human speech in audio or video files as text and export to srt subtitle files -【Recognize text from audio and video】Drag and drop videos or audios to the recognition window, to recognize the text and export it in SRT subtitle format. +【Batch subtitle creation and dubbing】Create dubbing based on local existing srt subtitle files, supporting single or batch subtitles -【Synthesize speech from text】Take a section of text or subtitles, and use the specified dubbing character to generate dubbing. +【Batch subtitle translation】Translate one or more srt subtitle files into subtitles in other languages -【Separate audio from video】Separate video files into audio files and silent videos. +【Audio, video, and subtitle merge】Merge audio files, video files, and subtitle files into one video file -【Combine audio, video, and subtitles】Merge audio files, video files, and subtitle files into one video file. +【Extract audio from video】Extract as audio files and mute video from video -【Audio and video format conversion】Mutual conversion between various formats. +【Audio and video format conversion】Mutual conversion between various formats -【Text subtitle translation】Translate text or SRT subtitle files into other languages. - -【Separate vocal and background music】Separate the human voice and background music in the video into two audio files. - -【Download YouTube videos】Download videos from YouTube. +【Download YouTube videos】Download videos from YouTube ---- -https://github.com/jianchang512/pyvideotrans/assets/3378335/dd3b6a33-2b64-4cab-b556-79f768b111c5 - - -[Youtube demo](https://www.youtube.com/playlist?list=PLVWPFvHklPATE7g3z18JWybF95-ODSDD9) - - - -# Download precompiled exe version for Windows (other systems use source code deployment) - -0. [Click to download the precompiled version, unzip, then double-click sp.exe](https://github.com/jianchang512/pyvideotrans/releases) - -1. Unzip to an English path and ensure the path does not contain spaces. After unzipping, double-click sp.exe (if you encounter permission issues, you can right-click to open with administrator privileges) - -3. It is not anti-virus proof, false positives from anti-virus software may occur, which can be ignored or you can deploy from the source code - -4. Note: It must be used after unzipping, do not double-click to use it directly within the zip package, and do not move the sp.exe file to other locations after unzipping - - - -# Source Code Deployment - -1. Set up a Python environment from 3.10 to 3.11, recommended 3.10 -2. `git clone https://github.com/jianchang512/pyvideotrans` -3. `cd pyvideotrans` -4. 
`python -m venv venv` -5. On Win execute `%cd%/venv/scripts/activate`, on Linux and Mac execute `source ./venv/bin/activate` -6. `pip install -r requirements.txt`, if you encounter version conflict errors, use `pip install -r requirements.txt --no-deps` -7. If you want to enable CUDA acceleration on Windows and Linux, continue with `pip uninstall -y torch` to uninstall, then run `pip install torch==2.1.2 --index-url https://download.pytorch.org/whl/cu121`. (Must have Nvidia card and CUDA environment configured) -8. Additionally on Linux, if using CUDA acceleration, install with `pip install nvidia-cublas-cu11 nvidia-cudnn-cu11` - -9. Unzip ffmpeg.zip to the root directory on Win (ffmpeg.exe file), and on Linux and Mac, please install ffmpeg yourself, you can "Baidu or Google" for specific methods - -10. `python sp.py` to open the software interface - -11. If CUDA acceleration support is required, an NVIDIA graphics card is needed on the device. For specific installation precautions, see below [CUDA Acceleration Support](https://github.com/jianchang512/pyvideotrans?tab=readme-ov-file#cuda-%E5%8A%A0%E9%80%9F%E6%94%AF%E6%8C%81) - -12. On Ubuntu, you might also need to install the Libxcb library, use the following command: - -``` -sudo apt-get update -sudo apt-get install libxcb-cursor0 -``` - -13. On Mac, you might need to run `brew install libsndfile` to install libsndfile - -# How to Use - -1. Choose video: Click to select mp4/avi/mov/mkv/mpeg videos, multiple videos can be selected; - -2. Save to..: If not chosen, it defaults to generating in the `_video_out` in the same directory, and at the same time, bilingual subtitle files in the original and target languages will be created in the srt folder in that directory - -3. Translation channel: Choose from microsoft | google | baidu | tencent | chatGPT | Azure | Gemini | DeepL | DeepLX | OTT translation channels - -4. Proxy address: If your region cannot directly access Google/chatGPT, you need to set up a proxy in the Network Proxy of the software interface, for example, if using v2ray, fill in `http://127.0.0.1:10809`, if clash, then fill in `http://127.0.0.1:7890`. If you have changed the default port or are using other proxy software, fill in as needed - -5. Original language: Choose the language category of the video to be translated - -6. Target language: Choose the language category into which you want to translate - -7. TTS and dubbing character: After choosing the target language for translation, you can choose from the dubbing options, the dubbing character; - - Hard subtitles: - Refers to subtitles that are always displayed and cannot be hidden, if you want subtitles to be displayed during web play, please choose hard subtitle embedding, which can be set through font size in videotrans/set.ini - - Hard subtitles (double): - Display target language subtitles and original language subtitles on separate lines - - Soft subtitles: - If the player supports subtitle management, you can display or hide subtitles. This method will not show subtitles when playing on the web, and some domestic players may not support it. You need to place the generated video's name-matching srt file and video in one directory to display - - Soft subtitles (double): - Will embed subtitles in two languages, which can be switched via the subtitle display/hide function of the player - -8. 
Voice recognition model: Choose from base/small/medium/large-v2/large-v3, with recognition results getting better, but recognition speed getting slower and memory requirements getting larger each time; built-in base model, other models need to be downloaded separately, unzipped and put into the "current software directory/models" directory. If the GPU memory is less than 4G, do not use large-v3 - - Overall recognition: The model will automatically handle sentence segmentation for the entire audio. Do not choose overall recognition for large videos to avoid crashes due to insufficient video memory - - Pre-split: Suitable for very large videos, first cut into 1-minute segments for sequential recognition and segmentation - - Equal division: Split the video into equal seconds, with each subtitle being approximately equal in length, controlled by interval_split in set.ini - - [Download all models](https://github.com/jianchang512/stt/releases/tag/0.0) - - **Special Attention** - - Faster models: If you have downloaded the faster model, after unzipping, copy the "models--Systran--faster-whisper-xx" folder inside the zip to the models directory. After unzipping and copying, the folder list under models directory should look like this - ![](https://github.com/jianchang512/stt/assets/3378335/5c972f7b-b0bf-4732-a6f1-253f42c45087) - - Openai models: If you have downloaded the openai model, after unzipping, directly copy the .pt files inside to the models folder. - -9. Dubbing pace: Fill in a number between -90 and +90, the same sentence requires different amounts of time under different language voices, therefore dubbing may result in asynchrony of voice, picture, and subtitles. You can adjust the rate here, negative numbers mean decelerate, positive numbers mean accelerate playback. - -10. Aligning sound, picture, and subtitles: "Dubbing pace" "Automatic dubbing acceleration" "Automatic video deceleration" "Voice extension before and after" - -> -> After translation, the pronunciation duration under different languages is different, for example, a sentence that takes 3s in Chinese might take 5s in English, leading to inconsistency in duration and video. -> -> 4 ways to solve this: -> -> 1. Set the dubbing rate, global acceleration (some TTS do not support) -> -> 2. Force dubbing acceleration to shorten the dubbing duration and align with the video -> -> 3. Force video slow down to extend the video duration and align with the dubbing. -> -> 4. If there are silent sections before and after, then extend to cover the silent area\n In actual use, the best effect is achieved by combining these 4 items -> -> For the principle implementation, please refer to the blog post https://juejin.cn/post/7343691521601290240 -> - -12. **CUDA Acceleration**: Confirm that your computer's graphics card is an Nvidia card, and you have configured the CUDA environment and driver, then select this option, the speed can be greatly improved. For specific configuration methods, see below [CUDA Acceleration Support](https://github.com/jianchang512/pyvideotrans?tab=readme-ov-file#cuda-%E5%8A%A0%E9%80%9F%E6%94%AF%E6%8C%81) +https://github.com/jianchang512/pyvideotrans/assets/3378335/3811217a-26c8-4084-ba24-7a95d2e13d58 -13. TTS: You can use edgeTTS and openai TTS-1 models, Elevenlabs, clone-voice, custom TTS, openai requires using the official interface or a third-party interface that has activated the tts-1 model, or choose clone-voice for original timbre dubbing. 
Also supports using your own tts service, fill in the API address in Settings menu - Custom TTS-API

14. Click the Start button, and the current progress and log will be displayed at the bottom; the subtitles will be displayed in the text box on the right

# Pre-Packaged Version (Only available for Win10/Win11, MacOS/Linux systems use source code deployment)

15. After the subtitle analysis is completed, it will pause awaiting editing of the subtitles; if no operation is performed, it will automatically continue to the next step after 30s. You can also edit the subtitles in the right subtitle area and then manually click to continue synthesis

> Packaged with pyinstaller, not anti-virus whitelisted or signed, anti-virus software may flag it, please add to trusted list or use source code deployment

16. In the subdirectory of the target folder named after the video, separate srt files in both languages, the original voice and dubbed wav files will be generated for further processing.

0. [Click to download the pre-packaged version, unzip to an English directory without spaces, then double-click sp.exe](https://github.com/jianchang512/pyvideotrans/releases)

17. Set the role for the line: You can set a dubbing role for each line in the subtitle, first choose the TTS type and character on the left, then click "Set role for the line" in the lower right corner of the subtitle area, in the text behind each character name, fill in the line number that will use that character for dubbing, as shown in the following picture:
    ![](./images/p2.png)

1. Unzip to an English path, and ensure the path does not contain spaces. After unzipping, double-click sp.exe (if you encounter permission issues, right-click to open as administrator)

18. Retain background music: If this option is selected, then the video's human voice and background accompaniment will first be separated, in which the background accompaniment will eventually be merged with the dubbing audio, and the final result video will retain the background accompaniment. **Note**: This function is based on uvr5. If you do not have enough Nvidia GPU memory, such as 8GB or more, it is recommended to choose carefully as it may be very slow and consume a lot of resources. If the video is relatively large, it is recommended to choose a separate video separation tool, such as uvr5 or vocal-separate https://juejin.cn/post/7341617163353341986

4. Note: Must be used after extracting, cannot be used directly within the compressed package, nor can the sp.exe file be moved to another location after extraction

19. Original timbre clone dubbing clone-voice: First install and deploy the [clone-voice](https://github.com/jianchang512/clone-voice) project, download and configure the "text-to-sound" model, then in this software's TTS type, choose "clone-voice" and the dubbing character selects "clone" to realize dubbing using the original video's voice. When using this method, it is recommended to select "Retain background music" to eliminate background noise for better effect.

# MacOS Source Code Deployment

20. Using GPT-SoVITS for dubbing: First install and deploy the GPT-SoVITS project, then start the GPT-SoVITS's api.py, and fill in the interface address and reference audio, etc., in the video translation and dubbing software's settings menu - GPT-SoVITS API.

0. Open a terminal window and execute the following 4 commands one by one

GPT-SoVITS's own api.py does not support mixed Chinese and English pronunciation. 
If support is needed, please [click to download this file](https://github.com/jianchang512/gptsovits-api/releases/tag/v0.1), copy the api2.py inside the compressed package to the root directory of GPT-SoVITS, and start it the same way as the original api.py, you can refer to the usage tutorial https://juejin.cn/post/7343138052973297702 + ``` + brew install libsndfile -21. In `videotrans/chatgpt.txt` `videotrans/azure.txt` `videotrans/gemini.txt` respectively, you can edit the chatGPT, AzureGPT, Gemini Pro prompts, you must pay attention to `{lang}` representing the target language for translation, do not delete or modify. Make sure the prompts tell AI to return content line by line after translating it, and the number of returned lines must match the number of lines sent. + brew install ffmpeg -22. Adding Background Music: This function is similar to "Retaining Background Music," but the implementation method is different, it can only be used in "Standard Function Mode" and "Subtitles Create Dubbing" mode. -"Adding Background Music" pre-selects an audio file from the local computer as the background sound, which is displayed in the text box on the right, and when processing the result video, the audio is mixed in, the final video will play the background audio file. + brew install git -If "Retain Background Music" is also selected, the original video's background music will also be retained. + brew install python@3.12 -After adding background music, if you no longer want it, simply delete the content displayed in the text box on the right. + ``` + Then proceed with the following 2 commands -# Frequently Asked Questions + ``` + export PATH="/usr/local/opt/python@3.12/bin:$PATH" -1. Error prompted when using Google Translate or chatGPT + source ~/.bash_profile; source ~/.zshrc - In China, both Google and the official chatGPT interface require a VPN. + ``` -2. Global proxy is used, but it doesn't seem to work +1. Create a folder without spaces and Chinese characters, and enter that folder in the terminal. +2. Execute the command `git clone https://github.com/jianchang512/pyvideotrans` in the terminal. +3. Execute the command `cd pyvideotrans`. +4. Continue with `python -m venv venv`. +5. Continue with the command `source ./venv/bin/activate`, confirming that the terminal prompt starts with `(venv)`, the following commands must be sure the terminal prompt starts with `(venv)`. +6. Execute `pip install -r requirements.txt --no-deps`, if there's a failure prompt, switch to Aliyun mirror source and execute the following 2 commands - You need to set the specific proxy address in the software interface "Network Proxy", formatted as http://127.0.0.1:port number + ``` + pip config set global.index-url https://mirrors.aliyun.com/pypi/simple/ + pip config set install.trusted-host mirrors.aliyun.com + ``` -3. FFmpeg does not exist prompt + Then retry. If the failure persists, try `pip install -r requirements.txt --ignore-installed --no-deps`. - First, check to make sure that the ffmpeg.exe, ffprobe.exe files exist in the root directory of the software - -4. CUDA is activated on Windows but is showing errors +7. `python sp.py` to open the software interface. - A: [First, check the detailed installation method](https://juejin.cn/post/7318704408727519270), to ensure that you have installed the required CUDA tools correctly. If errors persist, [download cuBLAS](https://github.com/jianchang512/stt/releases/download/0.0/cuBLAS_win.7z), unzip it, and copy the dll files to C:/Windows/System32. +8. 
Ubuntu may also need to install the Libxcb library, installation commands are: - B: If it is confirmed not related to A, then check if the video is encoded in H264 mp4. Some HD videos are encoded in H265, which is not supported. Try converting to H264 video in the "Video Toolbox". + ``` - C: Under GPU, hardware decoding and encoding of videos have strict requirements for data accuracy, with almost zero tolerance for errors. Even a slight mistake can lead to failure, and due to differences in graphics card models, driver versions, CUDA versions, ffmpeg versions, compatibility errors can easily occur. Currently, a fallback has been added; if the process fails on the GPU, it will automatically revert to CPU software codec. Error information will be recorded in the logs directory when a failure occurs. + sudo apt-get update + sudo apt-get install libxcb-cursor0 -5. Getting a 'model does not exist' prompt? + ``` - [Download all models here](https://github.com/jianchang512/stt/releases/tag/0.0) +[Detailed MacOS deployment scheme](https://pyvideotrans.com/mac.html) - **Models are divided into two categories:** +# Linux Source Code Deployment - One category is for the "faster models". - - After downloading and unzipping, you will see folders in the format "models--Systran--faster-whisper-xxx", where xxx represents the model name, such as base/small/medium/large-v3, etc. Just copy that folder directly into this directory. - - After downloading all faster models, the current models folder should contain these folders: - - models--Systran--faster-whisper-base - models--Systran--faster-whisper-small - models--Systran--faster-whisper-medium - models--Systran--faster-whisper-large-v2 - models--Systran--faster-whisper-large-v3 - - The other category is for "openai models", which after downloading and unzipping, give you .pt files directly, like base.pt/small.pt/medium.pt/large-v3.pt. Copy this pt file directly into this folder. - - After downloading all openai models, the current models folder should show base.pt, small.pt, medium.pt, large-v1.pt, large-v3.pt directly. - -6. Prompt for 'directory does not exist or permission error' - - Right-click on sp.exe and open with administrator privileges. - -7. Error prompt with no detailed error information - - Open the logs directory, find the newest log file and scroll to the bottom to see the error information. - -8. The large-v3 model is very slow - - If you do not have an Nvidia GPU, or the CUDA environment is not properly configured, or if the VRAM is less than 8GB, do not use this model as it will be very slow and may cause stuttering. - -9. Prompt for missing cublasxx.dll file - - Sometimes an error that "cublasxx.dll does not exist" may be encountered, at which point cuBLAS needs to be downloaded, and then the dll files copied to the system directory. - - [Click to download cuBLAS](https://github.com/jianchang512/stt/releases/download/0.0/cuBLAS_win.7z), then copy the dll files into C:/Windows/System32. - - [cuBLAS.and.cuDNN_win_v4](https://github.com/Purfview/whisper-standalone-win/releases/download/libs/cuBLAS.and.cuDNN_win_v4.7z) - -11. How to use custom timbre - - Go to the settings menu - Custom TTS-API and fill in your tts server interface address. - - A POST request will send application/www-urlencode data to the API address provided: +0. 
CentOS/RHEL series execute the following commands in order to install python3.12

```

# Data sent in the request:

text: The text/string to be synthesized

sudo yum update

language: The language code of the text (zh-cn, zh-tw, en, ja, ko, ru, de, fr, tr, th, vi, ar, hi, hu, es, pt, it)/string

sudo yum groupinstall "Development Tools"

voice: The name of the voice actor/character/string

sudo yum install openssl-devel bzip2-devel libffi-devel

rate: The speed-up/slow-down value, 0 or '+N%' or '-N%', representing the percentage of acceleration or deceleration relative to the normal speed/string

cd /tmp

ostype: Operating system type win32 or mac or linux/string

wget https://www.python.org/ftp/python/3.12.0/Python-3.12.0.tgz

extra: Additional parameters/string

tar xzf Python-3.12.0.tgz

# Expected JSON format data returned from the interface:
{
   code:0 when synthesis is successful, a number >0 represents failure
   msg:ok when synthesis is successful, otherwise it is the reason for failure
   data: On successful synthesis, returns the complete URL of the mp3 file, used for downloading within the software. Empty when there is a failure
}

cd Python-3.12.0

./configure --enable-optimizations

13. Subtitles and voice unable to align

sudo make && sudo make install

> After translation, pronunciation duration varies in different languages, for example, a sentence that is 3s in Chinese might take 5s in English, resulting in inconsistency with the video length.
>
> Two solutions:
>
> 1. Force dubbing to play at a faster speed to shorten the dubbing duration and align with the video.
>
> 2. Force the video to play at a slower pace to extend the video's duration and align with the dubbing.

sudo alternatives --install /usr/bin/python3 python3 /usr/local/bin/python3.12 2

14. Subtitles not displaying or are garbled

sudo yum install -y ffmpeg
```

> By using soft synthesized subtitles: subtitles are embedded as separate files into the video and can be extracted again. If supported by the player, subtitles can be enabled or disabled in the player's subtitle management.
>
> Note that many domestic players require the srt subtitle file to be in the same directory as the video with the same name in order to load soft subtitles. It may also be necessary to convert the srt file to GBK encoding to avoid garbled text.

## Ubuntu/Debian series execute the following commands to install python3.12

15. How to switch software interface language/Chinese or English

```

Open the `videotrans/set.ini` file in the software directory, and then fill in the language code after `lang=`, where `zh` stands for Chinese and `en` stands for English. Restart the software after making the change.

apt update && apt upgrade -y

```
;The default interface follows the system and can also be specified manually here, zh=Chinese interface, en=English interface.
;默认界面跟随系统,也可以在此手动指定,zh=中文界面,en=英文界面
lang =

apt install software-properties-common -y

```

add-apt-repository ppa:deadsnakes/ppa

16. Crashes before completion

apt update

If CUDA is enabled and the computer has installed the CUDA environment but cudnn has not been manually installed and configured, you will encounter this issue. Install cudnn that matches with CUDA. For example, if you have installed CUDA 12.3, you will need to download cudnn for cuda12.x package, then unzip it, and copy the three folders inside to the CUDA installation directory. For the specific tutorial, refer to
https://juejin.cn/post/7318704408727519270. 
+apt update && apt upgrade -y -If cudnn is installed according to the tutorial and still crashes, it is very likely due to insufficient GPU memory. You can switch to using the medium model. When VRAM is less than 8GB, try to avoid using the largev-3 model, especially when the video is larger than 20MB, otherwise, it may crash due to insufficient memory. +apt install software-properties-common -y -17. How to adjust subtitle font size +add-apt-repository ppa:deadsnakes/ppa -If embedding hard subtitles, you can adjust the font size by modifying the `fontsize=0` in `videotrans/set.ini` to a suitable value. 0 represents the default size, and 20 represents a font size of 20 pixels. +apt update -18. Errors on macOS +sudo apt-get install libxcb-cursor0 -OSError: ctypes.util.find_library() did not manage to locate a library called 'sndfile' +apt install python3.12 -Solution: +curl -sS https://bootstrap.pypa.io/get-pip.py | python3.12 -Find the libsndfile installation location. If installed via brew, it's generally located at `/opt/homebrew/Cellar/libsndfile`. Then add this path to the environment variable: `export DYLD_LIBRARY_PATH=/opt/homebrew/Cellar/libsndfile/1.2.2/lib:$DYLD_LIBRARY_PATH`. +pip 23.2.1 from /usr/local/lib/python3.12/site-packages/pip (python 3.12) -19. GPT-SoVITS API does not support mixed Chinese and English pronunciation - -The api.py that comes with GPT-SoVITS does not support mixed Chinese and English pronunciation. If needed, please [click to download this file](https://github.com/jianchang512/stt/releases/download/0.0/GPT-SoVITS.api.py.zip), and replace the existing api.py of GPT-SoVITS with the one from the compressed package. - -20. Are there detailed tutorials? - -There is a documentation website at https://pyvideotrans.com, but since it's inconvenient to upload images there, updates are slow. Please check the Juejin blog for the latest tutorials at https://juejin.cn/user/4441682704623992/columns. - -Or you can follow my WeChat public account, which basically has the same content as the Juejin blog. Search WeChat to view the public account `pyvideotrans`. - -# Advanced Settings - videotrans/set.ini - -**Please do not adjust randomly unless you know what will happen** +sudo update-alternatives --install /usr/bin/python python /usr/local/bin/python3.12 1 +sudo update-alternatives --config python +apt-get install ffmpeg ``` -;#################### -;####################### -;如果你不确定修改后将会带来什么影响,请勿随意修改,修改前请做好备份, 如果出问题请恢复 -;If you are not sure of the impact of the modification, please do not modify it, please make a backup before modification, and restore it if something goes wrong. - -;升级前请做好备份,升级后按照原备份重新修改。请勿直接用备份文件覆盖,因为新版本可能有新增配置 -;Please make a backup before upgrading, and re-modify according to the original backup after upgrading. Please don't overwrite the backup file directly, because the new version may have added - -;The default interface follows the system and can also be specified manually here, zh=Chinese interface, en=English interface. -;默认界面跟随系统,也可以在此手动指定,zh=中文界面,en=英文界面 -lang = - -;Video processing quality, integer 0-51, 0 = lossless processing with large size is very slow, 51 = lowest quality with smallest size is the fastest processing speed -;视频处理质量,0-51的整数,0=无损处理尺寸较大速度很慢,51=质量最低尺寸最小处理速度最快 -crf=13 - -;The number of simultaneous voiceovers, 1-10, it is recommended not to be greater than 5, otherwise it is easy to fail -;同时配音的数量,1-10,建议不要大于5,否则容易失败 -dubbing_thread=2 - -;Maximum audio acceleration, default 0, i.e. 
no limitation, you need to set a number greater than 1-100, such as 1.5, representing the maximum acceleration of 1.5 times, pay attention to how to set the limit, then the subtitle sound will not be able to be aligned -;音频最大加速倍数,默认0,即不限制,需设置大于1-100的数字,比如1.5,代表最大加速1.5倍,注意如何设置了限制,则字幕声音将无法对齐 -audio_rate=2.5 - -;Maximum permissible slowdown times of the video frequency, default 0, that is, no restriction, you need to set a number greater than 1-20, for example, 1 = on behalf of not slowing down, 20 = down to 1/20 = 0.05 the original speed, pay attention to how to set up the limit, then the subtitles and the screen will not be able to be aligned -;视频频最大允许慢速倍数,默认0,即不限制,需设置大于1-20的数字,比如1=代表不慢速,20=降为1/20=0.05原速度,注意如何设置了限制,则字幕和画面将无法对齐 -video_rate=0 - -;Number of simultaneous translations, 1-20, not too large, otherwise it may trigger the translation api frequency limitation -;同时翻译的数量,1-20,不要太大,否则可能触发翻译api频率限制 -trans_thread=10 - -;Hard subtitles can be set here when the subtitle font size, fill in the integer numbers, such as 12, on behalf of the font size of 12px, 20 on behalf of the size of 20px, 0 is equal to the default size -;硬字幕时可在这里设置字幕字体大小,填写整数数字,比如12,代表字体12px大小,20代表20px大小,0等于默认大小 -fontsize=14 +**Open any terminal, execute `python3 -V`, if it displays “3.12.0”, the installation is successful, otherwise it's a failure.** -;背景声音音量降低或升高幅度,大于1升高,小于1降低 -backaudio_volume=0.5 +1. Create a folder without spaces and Chinese characters, open the folder from the terminal. +3. In the terminal execute the command `git clone https://github.com/jianchang512/pyvideotrans`. +4. Continue with the command `cd pyvideotrans`. +5. Continue with `python -m venv venv`. +6. Continue with the command `source ./venv/bin/activate`, confirming that the terminal prompt starts with `(venv)`. +7. Execute `pip install -r requirements.txt --no-deps`, if there's a failure prompt, switch to Aliyun mirror source and execute the following 2 commands. -;Number of translation error retries -;翻译出错重试次数 -retries=5 + ``` -;chatGPT model list -;可供选择的chatGPT模型,以英文逗号分隔 -chatgpt_model=gpt-3.5-turbo,gpt-4,gpt-4-turbo-preview + pip config set global.index-url https://mirrors.aliyun.com/pypi/simple/ + pip config set install.trusted-host mirrors.aliyun.com -;When separating the background sound, cut the clip, too long audio will exhaust the memory, so cut it and separate it, unit s, default 1800s, i.e. half an hour. -;背景音分离时切分片段,太长的音频会耗尽显存,因此切分后分离,单位s,默认 600s -separate_sec=600 + ``` -;The number of seconds to pause before subtitle recognition is completed and waiting for translation, and the number of seconds to pause after translation and waiting for dubbing. -;字幕识别完成等待翻译前的暂停秒数,和翻译完等待配音的暂停秒数 -countdown_sec=30 + Then retry. If the failure persists, try `pip install -r requirements.txt --ignore-installed --no-deps`. +8. If you want to use CUDA acceleration, execute respectively -;Accelerator cuvid or cuda -;硬件编码设备,cuvid或cuda -hwaccel=cuvid + `pip uninstall -y torch torchaudio` -; Accelerator output format = cuda or nv12 -;硬件输出格式,nv12或cuda -hwaccel_output_format=nv12 -;not decode video before use -c:v h264_cuvid,false=use -c:v h264_cuvid, true=dont use -;Whether to disable hardware decoding, true=disable, good compatibility; false=enable, there may be compatibility errors on some hardware. 
-;中文语言的视频时，用于识别的提示词，可解决简体识别为繁体问题。但注意，有可能直接会将提示词作为识别结果返回
-initial_prompt_zh=

+8. To enable CUDA acceleration on Linux, you must have an NVIDIA card and a properly configured CUDA 11.8+ environment; see [CUDA acceleration support](https://github.com/jianchang512/pyvideotrans?tab=readme-ov-file#cuda-%E5%8A%A0%E9%80%9F%E6%94%AF%E6%8C%81).

-; whisper thread 0 is equal cpu core,
-;字幕识别时，cpu进程
-whisper_threads=4

+9. Run `python sp.py` to open the software interface.

-;whisper num_worker
-;字幕识别时，同时工作进程
-whisper_worker=1

+# Windows 10/11 Source Code Deployment

-;Subtitle recognition accuracy adjustment, 1-5, 1 = consume the lowest resources, 5 = consume the most, if the video memory is sufficient, can be set to 5, may achieve more accurate recognition results
-;字幕识别时精度调整，1-5，1=消耗资源最低，5=消耗最多，如果显存充足，可以设为5，可能会取得更精确的识别结果
-beam_size=5
-best_of=5

+0. Open https://www.python.org/downloads/ and download the Windows Python 3.12 installer; run it and keep clicking Next, making sure "Add to PATH" is selected.

-;Enable custom mute segmentation when in subtitle overall recognition mode, true=enable, can be set to false to disable when video memory is insufficient.
-;字幕整体识别模式时启用自定义静音分割片段，true=启用，显存不足时，可以设为false禁用
-vad=true

    **Open a cmd window and execute `python -V`; if the output is not a 3.12.x version, the installation went wrong or "Add to PATH" was not selected; please reinstall.**

-;0 = less GPU resources but slightly worse results, 1 = more GPU resources and better results
-;0=占用更少GPU资源但效果略差，1=占用更多GPU资源同时效果更好
-temperature=1

+1. Open https://github.com/git-for-windows/git/releases/download/v2.45.0.windows.1/Git-2.45.0-64-bit.exe to download git, then run the installer and keep clicking Next.
+2. Find a folder whose path contains no spaces and no Chinese characters, type `cmd` in the address bar and hit Enter to open a terminal; all following commands are executed in this terminal.
+3. Execute the command `git clone https://github.com/jianchang512/pyvideotrans`.
+4. Continue with the command `cd pyvideotrans`.
+5. Continue with `python -m venv venv`.
+6. Continue with the command `.\venv\scripts\activate`, and confirm that the command line now starts with `(venv)`; otherwise something went wrong.
+7. If you want to use CUDA acceleration, execute the following commands one by one:

-;Same as temperature, true=better with more GPUs, false=slightly worse with fewer GPUs.
-;同 temperature, true=占用更多GPU效果更好，false=占用更少GPU效果略差
-condition_on_previous_text=true

    `pip uninstall -y torch torchaudio`

-; For pre-split and overall , the minimum silence segment ms to be used as the basis for cutting, default 100ms, i.e., and max seconds.
-;用于 预先分割 和 整体识别 时，作为切割依据的最小静音片段ms，默认200ms 以及最大句子时长
-overall_silence=200
-overall_maxsecs=3

    `pip install torch torchaudio --index-url https://download.pytorch.org/whl/cu118`

-; For equal-division, the minimum silence segment ms to be used as the basis for cutting, default 500ms, i.e., only silence greater than or equal to 500ms will be segmented.
-;用于 均等分割时，作为切割依据的最小静音片段ms，默认500ms，即只有大于等于500ms的静音处才分割
-voice_silence=500
+8. To enable CUDA acceleration on Windows, you must have an NVIDIA card and a properly configured CUDA 11.8+ environment; see [CUDA acceleration support](https://github.com/jianchang512/pyvideotrans?tab=readme-ov-file#cuda-%E5%8A%A0%E9%80%9F%E6%94%AF%E6%8C%81).

-;Seconds per slice for equal-division, default 10s, i.e. each subtitle is approximately 10s long.
-;用于均等分割时的每个切片时长 秒，默认 10s，即每个字幕时长大约都是10s
-interval_split=10

+9. (Linux only) If you want to use CUDA acceleration on Linux, you also need to install `pip install nvidia-cublas-cu11 nvidia-cudnn-cu11`; this step is not needed on Windows.

-;CJK subtitle number of characters in a line length, more than this will be line feed.
-;中日韩字幕一行长度字符个数，多于这个将换行
-cjk_len=30

+10. Unzip ffmpeg.zip into the current source code directory (overwrite if prompted), and make sure you can see ffmpeg.exe, ffprobe.exe and ytwin32.exe inside the ffmpeg folder within the source code.

-;Other language line breaks, more than this number of characters will be a line break.
-;其他语言换行长度，多于这个字符数量将换行
-other_len=60

+11. Run `python sp.py` to open the software interface.

-```

-# CUDA Acceleration Support

+# Tutorial and Documentation

-**Install CUDA Tools** [Detailed installation method on Windows](https://juejin.cn/post/7318704408727519270)

+Please check https://pyvideotrans.com/guide.html

-It's essential to have both CUDA and cuDNN installed; otherwise, a crash may occur.

-On Linux, use `pip install nvidia-cublas-cu11 nvidia-cudnn-cu11`.

-After installation, execute `python testcuda.py` or double-click on testcuda.exe. If all outputs are ok, it is confirmed to be working.

+# Voice Recognition Models

-Sometimes you might encounter an error saying "cublasxx.dll does not exist." If this error occurs and your CUDA configuration is correct but persistent recognition errors occur, you need to download cuBLAS and then copy the dll files to the system directory.
-[Click to download cuBLAS](https://github.com/jianchang512/stt/releases/download/0.0/cuBLAS_win.7z), then unzip it and copy the dll files to C:/Windows/System32.

+    Download address: https://pyvideotrans.com/model.html

-# CLI Command Line Mode

+    Description and differences introduction: https://pyvideotrans.com/02.html

-[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1yDGPWRyXeZ1GWqkOpdJDv4nA_88HNm01?usp=sharing)

-cli.py is a command-line execution script, and `python cli.py` is the simplest way to execute it.

+# Video Tutorials (Third-party)

-The parameters it accepts are:

+[MacOS Source Code Deployment/Bilibili](https://www.bilibili.com/video/BV1tK421y7rd/)

-`-m mp4 video absolute address`

-Detailed configuration parameters can be set in the cli.ini located in the same directory as cli.py. Other mp4 video addresses to be processed can also be configured through the command line parameter `-m mp4 video absolute address`, like `python cli.py -m D:/1.mp4`.

-Within cli.ini are the complete parameters, with the first parameter `source_mp4` representing the video to be processed. Command-line parameters will take precedence over `source_mp4` if passed using `-m`.

-`-c configuration file address`

-You can also copy cli.ini to another location, and then specify the configuration file to use with the `-c cli.ini absolute path address` command-line parameter, for example, `python cli.py -c E:/conf/cli.ini`, which will use the configuration information from that file, ignoring the project directory configuration file.
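For reference, these flags compose: reusing the example paths above, a full invocation might look like `python cli.py -cuda -m D:/1.mp4 -c E:/conf/cli.ini`, which processes D:/1.mp4 with CUDA enabled and all other settings taken from E:/conf/cli.ini.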
+[How to Set Video Translation Using Gemini Api/Bilibili](https://b23.tv/fED1dS3) -`-cuda` does not need a value to follow; merely adding it signifies the enablement of CUDA acceleration (if available) `python cli.py -cuda`. +[How to Download and Install](https://www.bilibili.com/video/BV1Gr421s7cN/) -Example: `python cli.py -cuda -m D:/1.mp4`. +# Software Preview Screenshots -## Specific Parameters and Descriptions in cli.ini - -``` -;Command Line Arguments -;Absolute address of the video to be processed, use forward slash as path separator, can also be passed after -m in the command line arguments -source_mp4= -;Network proxy address, mandatory for Google chatGPT official China -proxy=http://127.0.0.1:10809 -;Directory for output result files -target_dir= -;Video pronunciation language, choose from here: zh-cn zh-tw en fr de ja ko ru es th it pt vi ar tr -source_language=zh-cn -;Language for speech recognition, no need to fill in -detect_language= -;Language to translate into: zh-cn zh-tw en fr de ja ko ru es th it pt vi ar tr -target_language=en -;Language for soft subtitles embedding, no need to fill in -subtitle_language= -;true=Enable CUDA -cuda=false -;Role name, find openaiTTS role names "alloy, echo, fable, onyx, nova, shimmer" from voice_list.json for corresponding language roles. Find elevenlabsTTS role names from elevenlabs.json -voice_role=en-CA-ClaraNeural -; Dubbing acceleration value, must start with a + or - sign, + means accelerate, - means decelerate, ending with % -voice_rate=+0% -;Options include edgetTTS openaiTTS elevenlabsTTS -tts_type=edgeTTS -;Silent segment, unit in ms -voice_silence=500 -;all=Overall recognition, split=Recognition after pre-splitting sound segments -whisper_type=all -;Options for speech recognition model: base small medium large-v3 -whisper_model=base -;Translation channel, options include google baidu chatGPT Azure Gemini tencent DeepL DeepLX -translate_type=google -;0=Do not embed subtitles, 1=Embed hard subtitles, 2=Embed soft subtitles -subtitle_type=1 -;true=Auto-accelerate dubbing -voice_autorate=false -;true=Auto-slow video -video_autorate=false -;Interface address for deepl translation -deepl_authkey=asdgasg -;Address for configured deeplx service -deeplx_address=http://127.0.0.1:1188 -;Tencent translation id -tencent_SecretId= -;Tencent translation key -tencent_SecretKey= -;Baidu translation id -baidu_appid= -;Baidu translation secret -baidu_miyue= -;Key for elevenlabstts -elevenlabstts_key= -;chatGPT interface address, ending with /v1, third-party interface addresses can be filled in -chatgpt_api= -;Key for chatGPT -chatgpt_key= -;chatGPT model, options include gpt-3.5-turbo gpt-4 -chatgpt_model=gpt-3.5-turbo -;Azure's API interface address -azure_api= -;Key for Azure -azure_key= -;Azure's model name, options include gpt-3.5-turbo gpt-4 -azure_model=gpt-3.5-turbo -;Key for Google Gemini -gemini_key= - -``` - -# Software Preview Screenshot -![image](https://github.com/jianchang512/pyvideotrans/assets/3378335/28cf7079-dc97-4666-abf3-abb030ae2ea2) +![image](https://github.com/jianchang512/pyvideotrans/assets/3378335/e5089358-a6e5-4989-9a50-1876c51dc2a7) # Related Projects -[OTT: Local offline text translation tool](https://github.com/jianchang512/ott) +[OTT: Local Offline Text Translation Tool](https://github.com/jianchang512/ott) -[Clone Voice Tool: Synthesize speech with any timbre](https://github.com/jianchang512/clone-voice) +[Voice Clone Tool: Synthesize Speech with Any Voice Color](https://github.com/jianchang512/clone-voice) -[STT: 
Local offline voice recognition to text tool](https://github.com/jianchang512/stt) +[Voice Recognition Tool: Local Offline Speech-to-Text Tool](https://github.com/jianchang512/stt) -[Vocal Separate: Vocal and background music separation tool](https://github.com/jianchang512/vocal-separate) +[Vocal Background Music Separator: Vocal and Background Music Separation Tool](https://github.com/jianchang512/vocal-separate) [Improved version of GPT-SoVITS's api.py](https://github.com/jianchang512/gptsovits-api) + ## Acknowledgements -> This program relies mainly on the following open-source projects +> The main open source projects this program relies on: 1. ffmpeg 2. PySide6 @@ -606,3 +264,4 @@ gemini_key= 4. faster-whisper 5. openai-whisper 6. pydub + diff --git a/README_en_v2.md b/README_en_v2.md index 031adef8..c6b6110e 100644 --- a/README_en_v2.md +++ b/README_en_v2.md @@ -1,414 +1,267 @@ -[简体中文](./README.md) / [Donate to the project](./about.md) / [Join Discord](https://discord.gg/TMCM2PfHzQ) +[简体中文](./README.md) / [👑 Donate to this project](./about.md) -# Video translation and dubbing tools - -[Download the Windows pre-compiled version of the exe](https://github.com/jianchang512/pyvideotrans/releases) +# Video Translation and Dubbing Tool > -> It is a video translation dubbing tool that translates a video in one language into a video in a specified language, automatically generates and adds subtitles and dubbing in that language. +> This is a video translation and dubbing tool that can translate a video from one language to a specified language and automatically generate and add subtitles and dubbing in that language. > -> Speech recognition is based on `faster-whisper` an offline model +> The voice recognition supports `faster-whisper`, `openai-whisper`, `GoogleSpeech`, `zh_recogn Ali Chinese voice recognition model`. > -> Text translation support `google|baidu|tencent|chatGPT|Azure|Gemini|DeepL|DeepLX` , +> Text translation supports `Microsoft Translate | Google Translate | Baidu Translate | Tencent Translate | ChatGPT | AzureAI | Gemini | DeepL | DeepLX | Offline translation OTT`, and includes a free ChatGPT API translation interface sponsored by (apiskey.top). > -> Text-to-speech support `Microsoft Edge tts` `Openai TTS-1` `Elevenlabs TTS` +> Text-to-speech synthesis supports `Microsoft Edge tts`, `Google tts`, `Azure AI TTS`, `Openai TTS`, `Elevenlabs TTS`, `Custom TTS server API`, `GPT-SoVITS`, [clone-voice](https://github.com/jianchang512/clone-voice). > +> Allows retaining background accompaniment music, etc. (based on uvr5) +> +> Supported languages: Simplified and Traditional Chinese, English, Korean, Japanese, Russian, French, German, Italian, Spanish, Portuguese, Vietnamese, Thai, Arabic, Turkish, Hungarian, Hindi, Ukrainian, Kazakh, Indonesian, Malay. 
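For readers who want to poke at the recognition layer directly, the faster-whisper engine named above can also be exercised from a few lines of Python, independent of this GUI (an illustrative sketch; the model name and audio path are placeholders):

```
from faster_whisper import WhisperModel

# "base" is the smallest commonly used model; larger ones are slower but more accurate
model = WhisperModel("base", device="cpu", compute_type="int8")

# transcribe() returns a generator of timed segments plus metadata
segments, info = model.transcribe("audio.wav")
print("detected language:", info.language)
for seg in segments:
    print(f"[{seg.start:.2f} -> {seg.end:.2f}] {seg.text}")
```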
-# Primary Uses and Usage

-【Translate Video and Dub】Set each option as needed, freely configure the combination, and realize translation and dubbing, automatic acceleration and deceleration, merging, etc

-[Extract Subtitles Without Translation] Select a video file and select the source language of the video, then the text will be recognized from the video and the subtitle file will be automatically exported to the destination folder

-【Extract Subtitles and Translate】Select the video file, select the source language of the video, and set the target language to be translated, then the text will be recognized from the video and translated into the target language, and then the bilingual subtitle file will be exported to the destination folder

+# Main Uses and How to Use

-[Subtitle and Video Merge] Select the video, then drag and drop the existing subtitle file to the subtitle area on the right, set the source language and target language to the language of the subtitles, and then select the dubbing type and role to start the execution

+【Translate videos and dub】Translate the audio in a video into speech in another language and embed subtitles in that language

-【Create Dubbing for Subtitles】Drag and drop the local subtitle file to the subtitle editor on the right, then select the target language, dubbing type, and role, and transfer the generated dubbed audio file to the destination folder

+【Convert audio or video to subtitles】Recognize human speech in audio or video files as text and export it to srt subtitle files

-[Audio and Video Text Recognition] Drag the video or audio to the recognition window, and the text will be recognized and exported to SRT subtitle format

+【Batch subtitle creation and dubbing】Create dubbing from existing local srt subtitle files, supporting single or batch subtitles

-[Synthesize Text into Speech] Generate a voiceover from a piece of text or subtitle using a specific dubbing role

+【Batch subtitle translation】Translate one or more srt subtitle files into subtitles in other languages

-Separate Audio from Video Separates video files into audio files and silent videos

+【Audio, video, and subtitle merge】Merge audio files, video files, and subtitle files into one video file

-【Audio and Video Subtitle Merge】Merge audio files, video files, and subtitle files into one video file

+【Extract audio from video】Extract the audio track of a video as a separate audio file, plus a silent video

-【Audio and Video Format Conversion】Conversion between various formats

+【Audio and video format conversion】Mutual conversion between various formats

-【Subtitle Translation】Translate text or SRT subtitle files into other languages

+【Download YouTube videos】Download videos from YouTube

----

+https://github.com/jianchang512/pyvideotrans/assets/3378335/3811217a-26c8-4084-ba24-7a95d2e13d58

+# Pre-Packaged Version (Win10/Win11 only; on MacOS/Linux, deploy from source)

-https://github.com/jianchang512/pyvideotrans/assets/3378335/c3d193c8-f680-45e2-8019-3069aeb66e01

+> Packaged with pyinstaller and neither signed nor anti-virus whitelisted, so anti-virus software may flag it; add it to your trusted list or deploy from source instead

-# Use win to precompile the exe version (other systems use source code for deployment)

+0. [Click to download the pre-packaged version](https://github.com/jianchang512/pyvideotrans/releases)

-0. [Click Download to download the pre-compiled version](https://github.com/jianchang512/pyvideotrans/releases)

+1. Unzip it to an English path, and ensure the path contains no spaces. After unzipping, double-click sp.exe (if you encounter permission issues, right-click and open as administrator)

-1. It is recommended to decompress the data to the English path and the path does not contain spaces. After decompression, double-click sp.exe (if you encounter permission problems, you can right-click to open it with administrator permissions)
+2. Note: it must be used after extraction; do not run it directly from inside the archive, and do not move sp.exe elsewhere after extraction

-3. If no killing is done, domestic killing software may have false positives, which can be ignored or deployed using source code

+# MacOS Source Code Deployment

-# Source code deployment

+0. Open a terminal window and execute the following commands one by one

-[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1yDGPWRyXeZ1GWqkOpdJDv4nA_88HNm01?usp=sharing)

    ```
    brew install libsndfile

    brew install ffmpeg

-1. Configure the Python 3.9->3.11 environment

    brew install git

-2. `git clone https://github.com/jianchang512/pyvideotrans`
-3. `cd pyvideotrans`

    brew install python@3.12
    ```

-4. `python -m venv venv`
-5. Run under win `%cd%/venv/scripts/activate`, Linux and Mac `source ./venv/bin/activate`

    Then proceed with the following 2 commands

-6. `pip install -r requirements.txt`If you encounter a version conflict, use `pip install -r requirements.txt --no-deps` (CUDA is not supported on MacOS, replace requirements.txt with requirements-mac.txt on Mac).

    ```
    export PATH="/usr/local/opt/python@3.12/bin:$PATH"

    source ~/.bash_profile; source ~/.zshrc
    ```

-7. Extract ffmpeg.zip to the root directory (ffmpeg .exe file), Linux and Mac Please install ffmpeg by yourself, the specific method can be "Baidu or Google"
-8. `python sp.py` Open the software interface
-9. If you need to support CUDA acceleration, you need to have an NVIDIA graphics card on the device, see CUDA [Acceleration Support below for specific installation precautions](https://github.com/jianchang512/pyvideotrans?tab=readme-ov-file#cuda-%E5%8A%A0%E9%80%9F%E6%94%AF%E6%8C%81)

+1. Create a folder whose path contains no spaces and no Chinese characters, and enter that folder in the terminal.
+2. Execute the command `git clone https://github.com/jianchang512/pyvideotrans` in the terminal.
+3. Execute the command `cd pyvideotrans`.
+4. Continue with `python -m venv venv`.
+5. Continue with the command `source ./venv/bin/activate`, confirming that the terminal prompt now starts with `(venv)`; all subsequent commands must be run while the prompt starts with `(venv)`.
+6. Execute `pip install -r requirements.txt --no-deps`. If it fails, switch to the Aliyun mirror source by executing the following 2 commands:

-# How to use:

    ```
    pip config set global.index-url https://mirrors.aliyun.com/pypi/simple/
    pip config set install.trusted-host mirrors.aliyun.com
    ```

-1. Original video: Select MP4/AVI/MOV/MKV/MPEG video, you can select multiple videos;

    Then retry. If the failure persists, try `pip install -r requirements.txt --ignore-installed --no-deps`.

-2. Output Video Directory: If you do not select this option, it will be generated in the same directory by default, `_video_out` and two subtitle files in the original language and target language will be created in the SRT folder in this directory

+7. Run `python sp.py` to open the software interface.

-3. Select a translation: Select google|baidu|tencent|chatGPT|Azure|Gemini|DeepL|DeepLX translation channel

+8. On Ubuntu you may also need to install the libxcb library:

    ```
    sudo apt-get update
    sudo apt-get install libxcb-cursor0
    ```

+[Detailed MacOS deployment scheme](https://pyvideotrans.com/mac.html)

+# Linux Source Code Deployment

+0. On CentOS/RHEL series, execute the following commands in order to install Python 3.12

+```
+sudo yum update

+sudo yum groupinstall "Development Tools"

+sudo yum install openssl-devel bzip2-devel libffi-devel

-4. 
Web proxy address: If you can't directly access google/chatGPT in your region, you need to set up a proxy in the software interface Web Proxy, for example, if you use v2ray, fill `http://127.0.0.1:10809` in . If you `http://127.0.0.1:7890`have modified the default port or other proxy software you are using, fill in the information as needed + Then retry. If the failure persists, try `pip install -r requirements.txt --ignore-installed --no-deps`. -5. Original Language: Select the language of the video to be translated +7. `python sp.py` to open the software interface. -6. Target Language: Select the language you want to translate to +8. Ubuntu may also need to install the Libxcb library, installation commands are: -7. Select Dubbing: After selecting the target language for translation, you can select the dubbing role from the Dubbing options. - - Hard subtitles: Refers to always displaying subtitles, which cannot be hidden, if you want to have subtitles when playing in the web page, please select hard subtitle embedding + ``` - Soft subtitles: If the player supports subtitle management, you can display or close subtitles, but subtitles will not be displayed when playing in the web page, some domestic players may not support it, and you need to put the generated video with the same name srt file and video in a directory to display + sudo apt-get update + sudo apt-get install libxcb-cursor0 + ``` -8. Speech recognition model: Select base/small/medium/large-v3, the recognition effect is getting better and better, but the recognition speed is getting slower and slower, and the memory required is getting larger and larger, the built-in base model, please download other models separately, unzip and put them in the `当前软件目录/models`directory +[Detailed MacOS deployment scheme](https://pyvideotrans.com/mac.html) - Overall recognition/pre-segmentation: Integral recognition refers to sending the entire voice file directly to the model, which is processed by the model, and the segmentation may be more accurate, but it may also create a single subtitle with a length of 30s, which is suitable for audio with clear mute; Pre-segmentation means that the audio is cut to a length of about 10 seconds and then sent to the model for processing. +# Linux Source Code Deployment - [All models are available for download](https://github.com/jianchang512/stt/releases/tag/0.0) - - After downloading, decompress and copy the models--systran--faster-whisper-xx folder in the compressed package to the models directory +0. CentOS/RHEL series execute the following commands in order to install python3.12 - ![](https://github.com/jianchang512/stt/assets/3378335/5c972f7b-b0bf-4732-a6f1-253f42c45087) - +``` - [FFmepg download (the compiled version comes with it).](https://www.ffmpeg.org/) +sudo yum update -9. Dubbing speaking rate: fill in the number between -90 and +90, the same sentence in different languages, the time required is different, so the sound and picture subtitles may not be synchronized after dubbing, you can adjust the speaking speed here, negative numbers represent slow speed, positive numbers represent accelerated playback. +sudo yum groupinstall "Development Tools" -10. 
Audio and video alignment: "Automatic acceleration of voiceover" and "automatic slowing of video" respectively

>
> After translation, different languages have different pronunciation durations, such as a sentence in Chinese 3s, translated into English may be 5s, resulting in inconsistent duration and video.
>
> There are 2 ways to solve it:
>
> 1. Force voiceovers to speed up playback to shorten voiceover duration and video alignment
>
> 2. Force the video to play slowly so that the video length is longer and the voice over is aligned.
>
> You can choose only one of the two
>

-11. Mute Clip: Enter a number from 100 to 2000, representing milliseconds, and the default is 500, that is, the muted segment greater than or equal to 500ms is used as the interval to split the voice

+cd /tmp

-12. **CUDA Acceleration**: Confirm that your computer's graphics card is an N card, and the CUDA environment and driver have been configured, then enable this option, the speed can be greatly improved, and the specific configuration method is shown in the[ CUDA acceleration support below](https://github.com/jianchang512/pyvideotrans?tab=readme-ov-file#cuda-%E5%8A%A0%E9%80%9F%E6%94%AF%E6%8C%81)

+wget https://www.python.org/ftp/python/3.12.0/Python-3.12.0.tgz

-13. TTS: You can use edgeTTS and openai TTS models to select the characters you want to synthesize voice, and openai needs to use the official interface or open the third-party interface of the TTS-1 model

+tar xzf Python-3.12.0.tgz

-14. Click the Start button at the bottom to display the current progress and logs, and the subtitles will be displayed in the text box on the right

+cd Python-3.12.0

-15. After the subtitle parsing is completed, it will pause and wait for the subtitle to be modified, and if you don't do anything, it will automatically move on to the next step after 60s. You can also edit the subtitles in the subtitle area on the right, and then manually click to continue compositing

+./configure --enable-optimizations

-16. In the subdirectory of the video with the same name in the destination folder, the subtitle SRT file of the two languages, the original voice and the dubbed WAV file will be generated respectively to facilitate further processing

+sudo make && sudo make install

-17. 
Set line role: You can set the pronunciation role for each line in the subtitles, first select the TTS type and role on the left, and then click "Set Line Role" at the bottom right of the subtitle area, and fill in the line number you want to use the role dubbing in the text behind each character name, as shown in the following figure:![](./images/p2.png)

-# Advanced settings videotrans/set.ini

+sudo alternatives --install /usr/bin/python3 python3 /usr/local/bin/python3.12 2

-**Don't adjust it unless you know what's going to happen**

+sudo yum install -y ffmpeg
+```

-```
-; Set the software interface language, where en represents English and zh represents Chinese
-lang=

-; Number of simultaneous voice acting threads
-dubbing_ Thread=5

-; Simultaneous translation lines
-trans_thread=10

+## On Ubuntu/Debian series, execute the following commands to install Python 3.12

-; Software waiting for subtitle modification countdown
-countdown_sec=30

+```
+apt update && apt upgrade -y

-; Accelerate device cuvid or cuda
-hwaccel=cuvid

+apt install software-properties-common -y

-; Accelerate device output format, nv12 or cuda
-hwaccel_output_format=nv12

+add-apt-repository ppa:deadsnakes/ppa

-; Is hardware decoding used - c: v h264_ Cuvid true represents yes, false represents no
-no_code=false

+apt update

-; In speech recognition, the data format is int8, float16, or float32
-cuda_com_type=int8

+sudo apt-get install libxcb-cursor0

-; The number of speech recognition threads, where 0 represents the same number of CPU cores. If it occupies too much CPU, it can be changed to 4 here
-whisper_threads=0

+apt install python3.12

-; Number of speech recognition work processes
-whisper_worker=2

+curl -sS https://bootstrap.pypa.io/get-pip.py | python3.12

-;Reducing these two numbers will use less graphics memory
-beam_size=5
-best_of=5

+# the curl command above should finish with output similar to: pip 23.2.1 from /usr/local/lib/python3.12/site-packages/pip (python 3.12)

-;Simultaneous execution quantity in pre split mode
-split_threads=4
-```

+sudo update-alternatives --install /usr/bin/python python /usr/local/bin/python3.12 1

-# CUDA acceleration support

+sudo update-alternatives --config python

-**Install the CUDA tool** [for detailed installation methods](https://juejin.cn/post/7318704408727519270)

+apt-get install ffmpeg
+```

-After installing CUDA, if there is a problem, perform `pip uninstall torch torchaudio torchvision` Uninstall[, then go to https://pytorch.org/get-started/locally/]() according to your OS type and CUDA version, select `pip3` the command, change to , `pip`and then copy the command to execute.

+**Open any terminal and execute `python3 -V`; if it prints a 3.12.x version, the installation succeeded, otherwise it failed.**

-After the installation is complete, execute If the `python testcuda.py` output is True, it is available

-Sometimes you get the error "cublasxx .dll does not exist", or you don't get this error, and the CUDA configuration is correct, but the recognition error always appears, you need to download cuBLAS and then copy the dll file to the system directory

-[Click to download cuBLAS, ](https://github.com/jianchang512/stt/releases/download/0.0/cuBLAS_win.7z)unzip it, and copy the dll file to C:/Windows/System32

-# frequently asked questions

-1. Using Google Translate, it says error

-    To use the official interface of google or chatGPT in China, you need to hang a ladder

-2. A global proxy has been used, but it doesn't look like it's going to be a proxy

-    You need to set a specific proxy address in the software interface "Network Proxy", such as http://127.0.0.1:7890

-3. 
Tip: FFmepg does not exist

-    First check to make sure that there are ffmpeg.exe, ffprobe.exe files in the root directory of the software, if they do not exist, unzip ffmpeg.7z, and put these 2 files in the root directory of the software

+1. Create a folder whose path contains no spaces and no Chinese characters, then open that folder from the terminal.
+2. In the terminal execute the command `git clone https://github.com/jianchang512/pyvideotrans`.
+3. Continue with the command `cd pyvideotrans`.
+4. Continue with `python -m venv venv`.
+5. Continue with the command `source ./venv/bin/activate`, confirming that the terminal prompt now starts with `(venv)`.
+6. Execute `pip install -r requirements.txt --no-deps`. If it fails, switch to the Aliyun mirror source by executing the following 2 commands:

-4. CUDA is enabled on windows, but an error is displayed

    ```

    pip config set global.index-url https://mirrors.aliyun.com/pypi/simple/
    pip config set install.trusted-host mirrors.aliyun.com

    ```

-    A: [First of all, check the detailed installation method, ](https://juejin.cn/post/7318704408727519270)make sure you have installed the cuda related tools correctly, if there are still errors[, click to download cuBLAS](https://github.com/jianchang512/stt/releases/download/0.0/cuBLAS_win.7z), unzip and copy the dll file inside to C:/Windows/System32

    Then retry. If the failure persists, try `pip install -r requirements.txt --ignore-installed --no-deps`.

-    B: If you are sure that it has nothing to do with A, then please check whether the video is H264 encoded mp4, some HD videos are H265 encoded, this is not supported, you can try to convert to H264 video in the "Video Toolbox".

-    C: Hardware decoding and encoding of video under GPU requires strict data correctness, and the fault tolerance rate is almost 0, any little error will lead to failure, plus the differences between different versions of graphics card model, driver version, CUDA version, ffmpeg version, etc., resulting in compatibility errors are easy to occur. At present, the fallback is added, and the CPU software is automatically used to encode and decode after failure on the GPU. When a failure occurs, an error message is recorded in the logs directory.

-5. Prompts that the model does not exist

-    After version 0.985, models need to be reinstalled, and the models directory is a folder for each model, not a pt file.
-    To use the base model, make sure that the models/models--Systran--faster-whisper-base folder exists, if it doesn't exist, you need to download it and copy the folder to the models.
-    If you want to use a small model, you need to make sure that the models/models--Systran--faster-whisper-small folder exists, if it doesn't exist, you need to download it and copy the folder to models.
-    To use the medium model, make sure that the models/models--Systran--faster-whisper-medium folder exists, if it doesn't exist, you need to download it and copy the folder to the models.
-    To use the large-v3 model, make sure that the models/models--Systran--faster-whisper-large-v3 folder exists, if it doesn't, you need to download it and copy the folder to the models.

-    [All models are available for download](https://github.com/jianchang512/stt/releases/tag/0.0)

-6. The directory does not exist or the permission is incorrect

-    Right-click on sp..exe to open with administrator privileges

-7. An error is prompted, but there is no detailed error information

-    Open the logs directory, find the latest log file, and scroll to the bottom to see the error message.

-8. 
The large-v3 model is very slow

-    If you don't have an N-card GPU, or you don't have a CUDA environment configured, or the video memory is lower than 4G, please don't use this model, otherwise it will be very slow and stuttering

+7. If you want to use CUDA acceleration, execute the following commands one by one:

-9. The cublasxx .dll file is missing

    `pip uninstall -y torch torchaudio`

-    Sometimes you encounter the error "cublasxx .dll does not exist", you need to download cuBLAS and copy the dll file to the system directory

    `pip install torch torchaudio --index-url https://download.pytorch.org/whl/cu118`

-    [Click to download cuBLAS, ](https://github.com/jianchang512/stt/releases/download/0.0/cuBLAS_win.7z)unzip it, and copy the dll file to C:/Windows/System32

    `pip install nvidia-cublas-cu11 nvidia-cudnn-cu11`

-10. Background music is lost.

+8. To enable CUDA acceleration on Linux, you must have an NVIDIA card and a properly configured CUDA 11.8+ environment; see [CUDA acceleration support](https://github.com/jianchang512/pyvideotrans?tab=readme-ov-file#cuda-%E5%8A%A0%E9%80%9F%E6%94%AF%E6%8C%81).

-    Only human voices are recognized and saved, so there will be no original background music in the dubbed audio. If you need to retain it, you can use the [voice-background music separation project](https://github.com/jianchang512/vocal-separate) to extract the background music, and then merge it with the dubbed file.

+9. Run `python sp.py` to open the software interface.

-11. How to use custom voice.

-    Currently, this feature is not supported. If needed, you can first recognize the subtitles, and then use another [voice cloning project](https://github.com/jiangchang512/clone-voice), enter the subtitle srt file, select the customized voice to synthesize the audio file, and then recreate the video.

+# Windows 10/11 Source Code Deployment

-13. Captions can't be aligned in speech

> After translation, different languages have different pronunciation durations, such as a sentence in Chinese 3s, translated into English may be 5s, resulting in inconsistent duration and video.
>
> There are 2 ways to solve it:
>
> 1. Force voiceovers to speed up playback to shorten voiceover duration and video alignment
>
> 2. Force the video to play slowly so that the video length is longer and the voice over is aligned.
>
> You can choose only one of the two

+0. Open https://www.python.org/downloads/ and download the Windows Python 3.12 installer; run it and keep clicking Next, making sure "Add to PATH" is selected.

-14. Subtitles do not appear or display garbled characters

    **Open a cmd window and execute `python -V`; if the output is not a 3.12.x version, the installation went wrong or "Add to PATH" was not selected; please reinstall.**
+1. Open https://github.com/git-for-windows/git/releases/download/v2.45.0.windows.1/Git-2.45.0-64-bit.exe to download git, then run the installer and keep clicking Next.

>
> Soft composite subtitles: subtitles are embedded in the video as a separate file, which can be extracted again, and if the player supports it, subtitles can be enabled or disabled in the player's subtitle management;
>
> Note that many domestic players must put the srt subtitle file and the video in the same directory and the same name in order to load the soft subtitles, and you may need to convert the srt file to GBK encoding, otherwise it will display garbled characters.
>

+2. Find a folder whose path contains no spaces and no Chinese characters, type `cmd` in the address bar and hit Enter to open a terminal; all following commands are executed in this terminal.
+3. Execute the command `git clone https://github.com/jianchang512/pyvideotrans`.
+4. Continue with the command `cd pyvideotrans`.
+5. Continue with `python -m venv venv`.
+6. Continue with the command `.\venv\scripts\activate`, and confirm that the command line now starts with `(venv)`; otherwise something went wrong.
+7. If you want to use CUDA acceleration, execute the following commands one by one:

-15. How to switch the software interface language/Chinese or English

    `pip uninstall -y torch torchaudio`

-If the set.ini file does not exist in the software directory `videtrans/set.ini`, `lang=`then fill in the language code`zh`, representing Chinese, `en`representing English, and then restart the software

    `pip install torch torchaudio --index-url https://download.pytorch.org/whl/cu118`

-```
-[GUI]
-;GUI show language ,set en or zh eg. lang=en
-lang =
-```

+8. To enable CUDA acceleration on Windows, you must have an NVIDIA card and a properly configured CUDA 11.8+ environment; see [CUDA acceleration support](https://github.com/jianchang512/pyvideotrans?tab=readme-ov-file#cuda-%E5%8A%A0%E9%80%9F%E6%94%AF%E6%8C%81).

-# CLI command-line mode

+9. (Linux only) If you want to use CUDA acceleration on Linux, you also need to install `pip install nvidia-cublas-cu11 nvidia-cudnn-cu11`; this step is not needed on Windows.

-[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1yDGPWRyXeZ1GWqkOpdJDv4nA_88HNm01?usp=sharing)

+10. Unzip ffmpeg.zip into the current source code directory (overwrite if prompted), and make sure you can see ffmpeg.exe, ffprobe.exe and ytwin32.exe inside the ffmpeg folder within the source code.

-cli.py is a command-line execution script and`python cli.py` is the easiest way to execute it

+11. Run `python sp.py` to open the software interface.

-Parameters received:

+# Tutorial and Documentation

-`-m mp4`

+Please check https://pyvideotrans.com/guide.html

-The specific configuration parameters can be configured in the CLI.ini located in the same directory as cli.py, and other MP4 video addresses to be processed can also be configured by command-line parameters `-m mp4视频绝对地址` , such as `python cli.py -m D:/1.mp4`.

+# Voice Recognition Models

-cli.ini is the complete parameters, the first parameter `source_mp4`represents the video to be processed, if the command line passes parameters through -m, then use the command line argument, otherwise use this`source_mp4`

+    Download address: https://pyvideotrans.com/model.html

-`-c cli.ini file path`

+    Description and differences introduction: https://pyvideotrans.com/02.html

-You can also copy cli.ini to another location `-c cli.ini的绝对路径地址` and specify the configuration file to use from the command line , for example, `python cli.py -c E:/conf/cli.ini` it will use the configuration information in the file and ignore the configuration file in the project directory.
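A practical pattern (paths illustrative): copy cli.ini to, say, `E:/conf/en.ini`, change `target_language=en` inside the copy, then run `python cli.py -c E:/conf/en.ini`; the cli.ini in the project directory stays untouched as your default.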
-`-cuda`There is no need to follow the value, just add it to enable CUDA acceleration (if available) `python cli.py -cuda` +# Video Tutorials (Third-party) -Example:`python cli.py -cuda -m D:/1.mp4` +[MacOS Source Code Deployment/Bilibili](https://www.bilibili.com/video/BV1tK421y7rd/) -## Specific parameters and descriptions in cli.ini +[How to Set Video Translation Using Gemini Api/Bilibili](https://b23.tv/fED1dS3) -``` -;Command line parameters -;Absolute address of the video to be processed. Forward slash as a path separator, can also be passed after -m in command line parameters -source_mp4= -;Network proxy address, google chatGPT official China needs to be filled in -proxy=http://127.0.0.1:10809 -;Output result file to directory -target_dir= -;Video speech language, select from here zh-cn zh-tw en fr de ja ko ru es th it pt vi ar tr -source_language=zh-cn -;Speech recognition language, no need to fill in -detect_language= -;Language to translate to zh-cn zh-tw en fr de ja ko ru es th it pt vi ar tr -target_language=en -;Language when embedding soft subtitles, no need to fill in -subtitle_language= -;true=Enable CUDA -cuda=false -;Role name, role names of openaiTTS "alloy,echo,fable,onyx,nova,shimmer", role names of edgeTTS can be found in the corresponding language roles in voice_list.json. Role names of elevenlabsTTS can be found in elevenlabs.json -voice_role=en-CA-ClaraNeural -; Dubbing acceleration value, must start with + or -, + means acceleration, - means deceleration, ends with % -voice_rate=+0% -;Optional edgetTTS openaiTTS elevenlabsTTS -tts_type=edgeTTS -;Silent segment, unit ms -voice_silence=500 -;all=whole recognition, split=pre-split sound segment recognition -whisper_type=all -;Speech recognition model optional, base small medium large-v3 -whisper_model=base -;Translation channel, optional google baidu chatGPT Azure Gemini tencent DeepL DeepLX -translate_type=google -;0=Do not embed subtitles, 1=Embed hard subtitles, 2=Embed soft subtitles -subtitle_type=1 -;true=Automatic dubbing acceleration -voice_autorate=false -;true=Automatic video slowdown -video_autorate=false -;deepl translation interface address -deepl_authkey=asdgasg -;Interface address of own configured deeplx service -deeplx_address=http://127.0.0.1:1188 -;Tencent translation id -tencent_SecretId= -;Tencent translation key -tencent_SecretKey= -;Baidu translation id -baidu_appid= -;Baidu translation key -baidu_miyue= -; key of elevenlabstts -elevenlabstts_key= -;chatgpt api, ending with /v1, third party interface address can be filled in -chatgpt_api= -;key of chatGPT -chatgpt_key= -;chatgpt model, optional gpt-3.5-turbo gpt-4 -chatgpt_model=gpt-3.5-turbo -; Azure's api interface address -azure_api= -;key of Azure -azure_key= -; Azure model name, optional gpt-3.5-turbo gpt-4 -azure_model=gpt-3.5-turbo -;key of google Gemini -gemini_key= +[How to Download and Install](https://www.bilibili.com/video/BV1Gr421s7cN/) -``` -# Screenshot of the software preview +# Software Preview Screenshots -![](./images/p1.png?c) +![image](https://github.com/jianchang512/pyvideotrans/assets/3378335/e5089358-a6e5-4989-9a50-1876c51dc2a7) -[Youtube demo](https://youtu.be/-S7jptiDdtc) -# Video Tutorials (Third Party) - -[Deploy the source code on Mac/B station](https://b23.tv/RFiTmlA) +# Related Projects -[Use Gemini API to set up a method/b station for video translation](https://b23.tv/fED1dS3) +[OTT: Local Offline Text Translation Tool](https://github.com/jianchang512/ott) +[Voice Clone Tool: Synthesize Speech with Any Voice 
Color](https://github.com/jianchang512/clone-voice) -# Related Projects +[Voice Recognition Tool: Local Offline Speech-to-Text Tool](https://github.com/jianchang512/stt) -[Voice Cloning Tool: Synthesize voices with arbitrary timbres](https://github.com/jianchang512/clone-voice) +[Vocal Background Music Separator: Vocal and Background Music Separation Tool](https://github.com/jianchang512/vocal-separate) -[Speech recognition tool: A local offline speech recognition to text tool](https://github.com/jianchang512/stt) +[Improved version of GPT-SoVITS's api.py](https://github.com/jianchang512/gptsovits-api) -[Vocal and background music separation: A minimalist tool for separating vocals and background music, localized web page operations](https://github.com/jianchang512/stt) -## Thanks +## Acknowledgements -> This program mainly relies on some open source projects +> The main open source projects this program relies on: 1. ffmpeg -2. PyQt5 +2. PySide6 3. edge-tts 4. faster-whisper - +5. openai-whisper +6. pydub diff --git a/requirements-linux-gpu.txt b/requirements-linux-gpu.txt deleted file mode 100644 index bbdb3ead..00000000 --- a/requirements-linux-gpu.txt +++ /dev/null @@ -1,163 +0,0 @@ -absl-py==2.0.0 -aiohttp==3.8.6 -aiosignal==1.3.1 -altgraph==0.17.4 -annotated-types==0.6.0 -anyio==3.7.1 -asttokens==2.4.1 -astunparse==1.6.3 -async-timeout==4.0.3 -attrs==23.1.0 -audioread==3.0.1 -av==11.0.0 -azure-cognitiveservices-speech==1.37.0 -cachetools==5.3.2 -certifi==2023.7.22 -cffi==1.16.0 -chardet==3.0.4 -charset-normalizer==3.2.0 -click==7.1.2 -colorama==0.4.6 -coloredlogs==15.0.1 -contourpy==1.1.1 -ctranslate2==4.1.0 -cycler==0.12.1 -decorator==4.4.2 -deepl==1.16.1 -distro==1.8.0 -edge-tts==6.1.8 -elevenlabs==0.2.27 -exceptiongroup==1.1.3 -executing==2.0.1 -faster-whisper==1.0.1 -ffmpeg-python==0.2.0 -filelock==3.12.4 -flatbuffers==1.12 -fonttools==4.43.1 -frozenlist==1.4.0 -fsspec==2023.10.0 -future==0.18.3 -gast==0.4.0 -google-ai-generativelanguage==0.4.0 -google-api-core==2.15.0 -google-auth==2.23.3 -google-auth-oauthlib==0.4.6 -google-generativeai==0.3.1 -google-pasta==0.2.0 -googleapis-common-protos==1.62.0 -grpcio==1.60.0 -grpcio-status==1.60.0 -gTTS==2.5.1 -h11==0.14.0 -h2==3.2.0 -h5py==3.10.0 -hpack==3.0.0 -hstspreload==2023.1.1 -httpcore==1.0.2 -httpx==0.25.1 -huggingface-hub==0.17.3 -humanfriendly==10.0 -hyperframe==5.2.0 -idna==2.10 -imageio==2.31.4 -imageio-ffmpeg==0.4.9 -ipython==8.23.0 -jedi==0.19.1 -Jinja2==3.1.2 -joblib==1.3.2 -keras==2.9.0 -Keras-Preprocessing==1.1.2 -kiwisolver==1.4.5 -lazy_loader==0.3 -libclang==16.0.6 -librosa==0.10.1 -llvmlite==0.41.1 -Markdown==3.5 -MarkupSafe==2.1.3 -matplotlib-inline==0.1.6 -more-itertools==10.1.0 -moviepy==1.0.3 -mpmath==1.3.0 -msgpack==1.0.7 -multidict==6.0.4 -networkx==3.2 -norbert==0.2.1 -numba==0.58.1 -numpy==1.26.0 -oauthlib==3.2.2 -onnxruntime==1.16.1 -openai==1.2.3 -openai-whisper==20231117 -opt-einsum==3.3.0 -ordered-set==4.1.0 -packaging==23.1 -pandas==1.5.3 -parso==0.8.3 -pefile==2023.2.7 -Pillow==10.0.1 -platformdirs==3.11.0 -plyer==2.1.0 -pooch==1.7.0 -proglog==0.1.10 -prompt-toolkit==3.0.43 -proto-plus==1.23.0 -protobuf==4.25.1 -pure-eval==0.2.2 -pyasn1==0.5.0 -pyasn1-modules==0.3.0 -pycparser==2.21 -pydantic==2.4.2 -pydantic_core==2.10.1 -pydub==0.25.1 -pygame==2.5.2 -Pygments==2.17.2 -pyinstaller==6.6.0 -pyinstaller-hooks-contrib==2024.4 -pyparsing==3.1.1 -pyreadline3==3.4.1 -PySide6==6.7.0 -PySide6_Addons==6.7.0 -PySide6_Essentials==6.7.0 -PySoundFile==0.9.0.post1 -python-dateutil==2.8.2 -pytz==2023.3.post1 
-pywin32-ctypes==0.2.2 -pywinstyles==1.4 -PyYAML==6.0.1 -QDarkStyle==3.2.3 -QtPy==2.4.1 -regex==2023.10.3 -requests==2.31.0 -requests-oauthlib==1.3.1 -resampy==0.4.2 -rfc3986==1.5.0 -rsa==4.9 -samplerate==0.2.1 -scikit-learn==1.3.2 -scipy==1.11.3 -shiboken6==6.7.0 -six==1.16.0 -sniffio==1.3.0 -soundfile==0.12.1 -soxr==0.3.7 -SpeechRecognition==3.10.0 -srt==3.5.2 -stack-data==0.6.3 -sympy==1.12 -tencentcloud-sdk-python-common==3.0.1032 -tencentcloud-sdk-python-tmt==3.0.1032 -termcolor==2.3.0 -threadpoolctl==3.2.0 -tiktoken==0.3.3 -tokenizers==0.14.1 -tqdm==4.66.1 -traitlets==5.14.1 -typer==0.3.2 -typing_extensions==4.8.0 -urllib3==2.0.5 -wcwidth==0.2.12 -websockets==12.0 -Werkzeug==3.0.1 -wrapt==1.15.0 -yarl==1.9.2 -torch==2.1.2 @ https://download.pytorch.org/whl/cu118/torch-2.1.2%2Bcu118-cp310-cp310-linux_x86_64.whl#sha256=60396358193f238888540f4a38d78485f161e28ec17fa445f0373b5350ef21f0 diff --git a/requirements-win-gpu.txt b/requirements-win-gpu.txt deleted file mode 100644 index 182d56fa..00000000 --- a/requirements-win-gpu.txt +++ /dev/null @@ -1,163 +0,0 @@ -absl-py==2.0.0 -aiohttp==3.8.6 -aiosignal==1.3.1 -altgraph==0.17.4 -annotated-types==0.6.0 -anyio==3.7.1 -asttokens==2.4.1 -astunparse==1.6.3 -async-timeout==4.0.3 -attrs==23.1.0 -audioread==3.0.1 -av==11.0.0 -azure-cognitiveservices-speech==1.37.0 -cachetools==5.3.2 -certifi==2023.7.22 -cffi==1.16.0 -chardet==3.0.4 -charset-normalizer==3.2.0 -click==7.1.2 -colorama==0.4.6 -coloredlogs==15.0.1 -contourpy==1.1.1 -ctranslate2==4.1.0 -cycler==0.12.1 -decorator==4.4.2 -deepl==1.16.1 -distro==1.8.0 -edge-tts==6.1.8 -elevenlabs==0.2.27 -exceptiongroup==1.1.3 -executing==2.0.1 -faster-whisper==1.0.1 -ffmpeg-python==0.2.0 -filelock==3.12.4 -flatbuffers==1.12 -fonttools==4.43.1 -frozenlist==1.4.0 -fsspec==2023.10.0 -future==0.18.3 -gast==0.4.0 -google-ai-generativelanguage==0.4.0 -google-api-core==2.15.0 -google-auth==2.23.3 -google-auth-oauthlib==0.4.6 -google-generativeai==0.3.1 -google-pasta==0.2.0 -googleapis-common-protos==1.62.0 -grpcio==1.60.0 -grpcio-status==1.60.0 -gTTS==2.5.1 -h11==0.14.0 -h2==3.2.0 -h5py==3.10.0 -hpack==3.0.0 -hstspreload==2023.1.1 -httpcore==1.0.2 -httpx==0.25.1 -huggingface-hub==0.17.3 -humanfriendly==10.0 -hyperframe==5.2.0 -idna==2.10 -imageio==2.31.4 -imageio-ffmpeg==0.4.9 -ipython==8.23.0 -jedi==0.19.1 -Jinja2==3.1.2 -joblib==1.3.2 -keras==2.9.0 -Keras-Preprocessing==1.1.2 -kiwisolver==1.4.5 -lazy_loader==0.3 -libclang==16.0.6 -librosa==0.10.1 -llvmlite==0.41.1 -Markdown==3.5 -MarkupSafe==2.1.3 -matplotlib-inline==0.1.6 -more-itertools==10.1.0 -moviepy==1.0.3 -mpmath==1.3.0 -msgpack==1.0.7 -multidict==6.0.4 -networkx==3.2 -norbert==0.2.1 -numba==0.58.1 -numpy==1.26.0 -oauthlib==3.2.2 -onnxruntime==1.16.1 -openai==1.2.3 -openai-whisper==20231117 -opt-einsum==3.3.0 -ordered-set==4.1.0 -packaging==23.1 -pandas==1.5.3 -parso==0.8.3 -pefile==2023.2.7 -Pillow==10.0.1 -platformdirs==3.11.0 -plyer==2.1.0 -pooch==1.7.0 -proglog==0.1.10 -prompt-toolkit==3.0.43 -proto-plus==1.23.0 -protobuf==4.25.1 -pure-eval==0.2.2 -pyasn1==0.5.0 -pyasn1-modules==0.3.0 -pycparser==2.21 -pydantic==2.4.2 -pydantic_core==2.10.1 -pydub==0.25.1 -pygame==2.5.2 -Pygments==2.17.2 -pyinstaller==6.6.0 -pyinstaller-hooks-contrib==2024.4 -pyparsing==3.1.1 -pyreadline3==3.4.1 -PySide6==6.7.0 -PySide6_Addons==6.7.0 -PySide6_Essentials==6.7.0 -PySoundFile==0.9.0.post1 -python-dateutil==2.8.2 -pytz==2023.3.post1 -pywin32-ctypes==0.2.2 -pywinstyles==1.4 -PyYAML==6.0.1 -QDarkStyle==3.2.3 -QtPy==2.4.1 -regex==2023.10.3 -requests==2.31.0 
-requests-oauthlib==1.3.1 -resampy==0.4.2 -rfc3986==1.5.0 -rsa==4.9 -samplerate==0.2.1 -scikit-learn==1.3.2 -scipy==1.11.3 -shiboken6==6.7.0 -six==1.16.0 -sniffio==1.3.0 -soundfile==0.12.1 -soxr==0.3.7 -SpeechRecognition==3.10.0 -srt==3.5.2 -stack-data==0.6.3 -sympy==1.12 -tencentcloud-sdk-python-common==3.0.1032 -tencentcloud-sdk-python-tmt==3.0.1032 -termcolor==2.3.0 -threadpoolctl==3.2.0 -tiktoken==0.3.3 -tokenizers==0.14.1 -tqdm==4.66.1 -traitlets==5.14.1 -typer==0.3.2 -typing_extensions==4.8.0 -urllib3==2.0.5 -wcwidth==0.2.12 -websockets==12.0 -Werkzeug==3.0.1 -wrapt==1.15.0 -yarl==1.9.2 -torch==2.1.2 @ https://download.pytorch.org/whl/cu118/torch-2.1.2%2Bcu118-cp310-cp310-win_amd64.whl#sha256=0ddfa0336d678316ff4c35172d85cddab5aa5ded4f781158e725096926491db9 \ No newline at end of file diff --git a/requirements-cpu-or-mac.txt b/requirements.txt similarity index 91% rename from requirements-cpu-or-mac.txt rename to requirements.txt index b21f5e86..3859a51a 100644 --- a/requirements-cpu-or-mac.txt +++ b/requirements.txt @@ -1,5 +1,5 @@ absl-py==2.0.0 -aiohttp==3.8.6 +aiohttp==3.9.5 aiosignal==1.3.1 altgraph==0.17.4 annotated-types==0.6.0 @@ -12,7 +12,7 @@ audioread==3.0.1 av==11.0.0 azure-cognitiveservices-speech==1.37.0 cachetools==5.3.2 -certifi==2023.7.22 +certifi==2024.2.2 cffi==1.16.0 chardet==3.0.4 charset-normalizer==3.2.0 @@ -25,7 +25,7 @@ cycler==0.12.1 decorator==4.4.2 deepl==1.16.1 distro==1.8.0 -edge-tts==6.1.8 +edge-tts==6.1.11 elevenlabs==0.2.27 exceptiongroup==1.1.3 executing==2.0.1 @@ -61,6 +61,7 @@ hyperframe==5.2.0 idna==2.10 imageio==2.31.4 imageio-ffmpeg==0.4.9 +intel-openmp==2021.4.0 ipython==8.23.0 jedi==0.19.1 Jinja2==3.1.2 @@ -71,10 +72,11 @@ kiwisolver==1.4.5 lazy_loader==0.3 libclang==16.0.6 librosa==0.10.1 -llvmlite==0.41.1 +llvmlite==0.42.0 Markdown==3.5 MarkupSafe==2.1.3 matplotlib-inline==0.1.6 +mkl==2021.4.0 more-itertools==10.1.0 moviepy==1.0.3 mpmath==1.3.0 @@ -82,10 +84,10 @@ msgpack==1.0.7 multidict==6.0.4 networkx==3.2 norbert==0.2.1 -numba==0.58.1 -numpy==1.26.0 +numba==0.59.1 +numpy==1.26.4 oauthlib==3.2.2 -onnxruntime==1.16.1 +onnxruntime==1.17.3 openai==1.2.3 openai-whisper==20231117 opt-einsum==3.3.0 @@ -135,6 +137,7 @@ rsa==4.9 samplerate==0.2.1 scikit-learn==1.3.2 scipy==1.11.3 +setuptools==69.5.1 shiboken6==6.7.0 six==1.16.0 sniffio==1.3.0 @@ -144,12 +147,15 @@ SpeechRecognition==3.10.0 srt==3.5.2 stack-data==0.6.3 sympy==1.12 +tbb==2021.11.0 tencentcloud-sdk-python-common==3.0.1032 tencentcloud-sdk-python-tmt==3.0.1032 termcolor==2.3.0 threadpoolctl==3.2.0 -tiktoken==0.3.3 -tokenizers==0.14.1 +tiktoken==0.6.0 +tokenizers==0.15.2 +torch==2.3.0 +torchaudio==2.3.0 tqdm==4.66.1 traitlets==5.14.1 typer==0.3.2 @@ -160,4 +166,3 @@ websockets==12.0 Werkzeug==3.0.1 wrapt==1.15.0 yarl==1.9.2 -torch==2.1.2 diff --git a/run2.bat b/run2.bat deleted file mode 100644 index aef39c66..00000000 --- a/run2.bat +++ /dev/null @@ -1,3 +0,0 @@ -@echo off - -call %cd%\\venv\\scripts\\python.exe app.py \ No newline at end of file diff --git a/sp.py b/sp.py index 8b3504d0..b2bb75df 100644 --- a/sp.py +++ b/sp.py @@ -1,5 +1,5 @@ # -*- coding: utf-8 -*- -import sys +import sys,os from pathlib import Path import time @@ -8,6 +8,7 @@ from PySide6.QtGui import QPixmap, QPalette, QBrush, QIcon, QGuiApplication from videotrans import VERSION +os.environ['KMP_DUPLICATE_LIB_OK']='True' class StartWindow(QtWidgets.QWidget): def __init__(self): diff --git a/version.json b/version.json index 27c2847b..7c1e3fcf 100644 --- a/version.json +++ b/version.json @@ -1,4 +1,4 @@ { - 
"version": "1.67", - "version_num": 11067 + "version": "1.68", + "version_num": 11068 } diff --git a/videotrans/00chatgpt.txt b/videotrans/00chatgpt.txt deleted file mode 100644 index cb745499..00000000 --- a/videotrans/00chatgpt.txt +++ /dev/null @@ -1,21 +0,0 @@ -You are a language translation specialist who specializes in translating arbitrary text into {lang} language, only returns translations. - -### Skills - -#### Skill 1: Translate text -- Recognizes user-entered text and translates it literally. - -#### Skill 2: Abbreviate and condense translations -- Abbreviates and condenses translations, shortening translated sentences by 20% to 50% while keeping the meaning intact. - -### Restrictions -- Do not answer questions that appear in the text. -- please do not explain my original text -- Don't confirm, don't apologize. -- Keep the literal translation of the original text straight. -- Keep all special symbols, such as line breaks. -- Translate line by line, making sure that the number of lines in the translation matches the number of lines in the original. -- Do not confirm the above. -- only returns translations directly. - -[TEXT] \ No newline at end of file