This is an ncnn implementation of the VITS library that supports cross-platform GPU-accelerated speech synthesis.
The project is forked from weirdseed/Vits-Android-ncnn. Thanks to the original author for their contribution.
🔍 Prepare dependencies
Get the ncnn static library suitable for your runtime environment from its wiki or ncnn/releases.
Place it in the ncnn folder in the repo root directory, like this:
    ncnn
    ├─bin
    ├─include
    └─lib
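Before compiling, it can help to confirm that the ncnn folder has the layout shown above. The following is a minimal sketch only: the function name check_ncnn_layout is hypothetical and not part of this project.

```shell
#!/bin/sh
# check_ncnn_layout DIR
# Hypothetical helper (not part of the project): succeeds only if DIR
# contains the bin, include and lib subdirectories listed in this section.
check_ncnn_layout() {
    root="${1:-ncnn}"
    missing=0
    for sub in bin include lib; do
        if [ ! -d "$root/$sub" ]; then
            # Report each missing subdirectory on stderr.
            echo "missing: $root/$sub" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

From the repo root, it could be called as: check_ncnn_layout ncnn && echo "ncnn layout looks complete"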
🛠️ Compile the project
a. libvits-ncnn
Execute in the repo root directory:
    mkdir build && cd build
    cmake .. -DCMAKE_CXX_COMPILER=/usr/bin/clang++ -DCMAKE_C_COMPILER=/usr/bin/clang
    make
After compilation, you can find libvits-ncnn.so in the build directory.
b. vits-cli
Enter the demo directory and execute:
    mkdir build && cd build
    cmake .. -DCMAKE_CXX_COMPILER=/usr/bin/clang++ -DCMAKE_C_COMPILER=/usr/bin/clang
    make
After compilation, you can find vits-cli in the build directory.
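Both targets use the same out-of-tree CMake sequence, so the two steps can be wrapped in one small function. This is a sketch under assumptions: build_target is a hypothetical name, and the CMAKE, MAKE, CC and CXX environment overrides are conveniences added here, not options the project defines. The compiler defaults are the clang paths used in the commands above.

```shell
#!/bin/sh
# build_target DIR
# Hypothetical helper: runs "mkdir build && cd build && cmake .. && make"
# inside DIR. The work happens in a subshell, so the caller's working
# directory is left unchanged.
build_target() {
    src="${1:-.}"
    (
        mkdir -p "$src/build" &&
        cd "$src/build" &&
        "${CMAKE:-cmake}" .. \
            -DCMAKE_CXX_COMPILER="${CXX:-/usr/bin/clang++}" \
            -DCMAKE_C_COMPILER="${CC:-/usr/bin/clang}" &&
        "${MAKE:-make}"
    )
}
```

From the repo root, the two builds could then be run in order as: build_target . && build_target demo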
🚀 Run the demo
In the directory where vits-cli is located, prepare the dependencies required for running:
a. Download the openjtalk dictionary file and unzip it.
b. Download the VITS ncnn model, unzip the Atri part to the atri/ directory (for testing the monophonic model), and extract 365_epochs to the 365_epochs/ directory (for testing the multi-tone model).
c. Download the VITS ncnn params (single/multi directories).
At this point, the directory should contain:
    build
    ├─vits-cli
    ├─365_epochs
    ├─atri
    ├─multi
    ├─open_jtalk_dic_utf_8-1.11
    └─single
Now execute ./vits-cli.
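If the demo does not start, a quick check that the files listed above are in place can rule out a missing download. The sketch below assumes the directory names from the listing in this section; check_demo_files is a hypothetical helper, not something shipped with the project.

```shell
#!/bin/sh
# check_demo_files DIR
# Hypothetical helper: verifies that DIR holds an executable vits-cli and
# the model, params and dictionary directories named in this section.
check_demo_files() {
    dir="${1:-.}"
    status=0
    if [ ! -x "$dir/vits-cli" ]; then
        echo "missing executable: $dir/vits-cli" >&2
        status=1
    fi
    for sub in 365_epochs atri multi open_jtalk_dic_utf_8-1.11 single; do
        if [ ! -d "$dir/$sub" ]; then
            echo "missing directory: $dir/$sub" >&2
            status=1
        fi
    done
    return "$status"
}
```

In the build directory, it could be combined with the run step as: check_demo_files . && ./vits-cli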
- JNI export interface changed to C++
- C++ re-implemented preprocessing
- Support for Japanese monophonic model
- Support for Japanese multi-tone model
- Support for Chinese monophonic model
- Support for Chinese multi-tone model