EEND (End-to-End Neural Diarization) is a neural-network-based speaker diarization method.
- https://www.isca-speech.org/archive/Interspeech_2019/abstracts/2899.html
- https://arxiv.org/abs/1909.06247 (to appear at ASRU 2019)
- NVIDIA CUDA GPU
- CUDA Toolkit (8.0 <= version <= 10.1)
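The version constraint above can be checked before building. The following is a minimal sketch, not part of this repository: `cuda_version_supported` is a hypothetical helper, and it assumes the version is given as a `major.minor` string such as the one printed by `nvcc --version`.

```shell
# Hypothetical helper (not shipped with EEND): succeeds when the given
# CUDA version string lies in the supported range 8.0 <= version <= 10.1.
cuda_version_supported() {
    local ver="$1" major rest minor n
    # Reject empty input or anything other than digits and dots.
    case "$ver" in
        ''|*[!0-9.]*) return 2 ;;
    esac
    major="${ver%%.*}"        # text before the first dot
    rest="${ver#"$major"}"    # remainder, e.g. ".1" or ".1.243"
    rest="${rest#.}"
    minor="${rest%%.*}"       # second component only
    major="${major:-0}"
    minor="${minor:-0}"       # "10" is treated as "10.0"
    # Compare as major*100 + minor so that 10.1 ranks above 9.2.
    n=$(( major * 100 + minor ))
    [ "$n" -ge 800 ] && [ "$n" -le 1001 ]
}
```

For example, `cuda_version_supported 10.1 && echo supported` prints `supported`, while `cuda_version_supported 10.2` returns a non-zero status.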
cd tools
make
- This command builds Kaldi at
tools/kaldi
- If you want to use a pre-built Kaldi:
cd tools
make KALDI=<existing_kaldi_root>
This option makes a symlink at
tools/kaldi
- This command extracts miniconda3 at
tools/miniconda3
and creates a conda environment named 'eend'
- Then, it installs Chainer and cupy into the 'eend' environment
- The build uses CUDA in
/usr/local/cuda/
- If you need to specify your CUDA path:
cd tools
make CUDA_PATH=/your/path/to/cuda-8.0
This command installs cupy-cudaXX according to your CUDA version. See https://docs-cupy.chainer.org/en/stable/install.html#install-cupy
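The XX suffix in cupy-cudaXX follows the CuPy wheel naming, where the dot is dropped from the CUDA major.minor version (for example, cupy-cuda80 or cupy-cuda101). The sketch below illustrates that mapping. `cupy_package_for` is a hypothetical helper, and how the Makefile in tools actually derives the name is not shown here, so treat this as an assumption.

```shell
# Hypothetical helper (not shipped with EEND): print the CuPy wheel name
# that corresponds to a CUDA version string such as "8.0" or "10.1".
cupy_package_for() {
    local ver="$1" major rest minor
    # Reject empty input or anything other than digits and dots.
    case "$ver" in
        ''|*[!0-9.]*) return 2 ;;
    esac
    major="${ver%%.*}"        # text before the first dot
    rest="${ver#"$major"}"    # remainder, e.g. ".1" or ".1.243"
    rest="${rest#.}"
    minor="${rest%%.*}"       # keep only the second component
    # A missing component defaults to 0, so "10" maps to cupy-cuda100.
    printf 'cupy-cuda%s%s\n' "${major:-0}" "${minor:-0}"
}
```

With this helper, `cupy_package_for 8.0` prints `cupy-cuda80` and `cupy_package_for 10.1` prints `cupy-cuda101`.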
- Modify
egs/mini_librispeech/v1/cmd.sh
according to your job scheduler. If you use your local machine, use "run.pl". If you use Grid Engine, use "queue.pl". If you use SLURM, use "slurm.pl". For more information about cmd.sh, see http://kaldi-asr.org/doc/queue.html.
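The choice of wrapper script can be sketched as follows. This is an illustration only: `select_cmd` is a hypothetical helper, and the variable name `train_cmd` follows the common Kaldi recipe convention, so it is an assumption about what this recipe's cmd.sh actually defines.

```shell
# Hypothetical helper (not shipped with EEND): map a scheduler name to the
# Kaldi parallelization wrapper named in the instructions above.
select_cmd() {
    case "$1" in
        local)          echo "run.pl" ;;    # run jobs on the local machine
        gridengine|sge) echo "queue.pl" ;;  # submit jobs to Grid Engine
        slurm)          echo "slurm.pl" ;;  # submit jobs to SLURM
        *) echo "unknown scheduler: $1" >&2; return 1 ;;
    esac
}

# Assumed variable name, following the usual Kaldi cmd.sh convention.
train_cmd="$(select_cmd local)"
export train_cmd
```

Replacing `local` with `slurm` sets `train_cmd` to `slurm.pl`. Scheduler-specific options such as memory limits are described in the Kaldi documentation linked above.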
cd egs/mini_librispeech/v1
./run_prepare_shared.sh
./run.sh
- See
RESULT.md
and compare with your result.
- Modify
egs/callhome/v1/cmd.sh
according to your job scheduler. If you use your local machine, use "run.pl". If you use Grid Engine, use "queue.pl". If you use SLURM, use "slurm.pl". For more information about cmd.sh, see http://kaldi-asr.org/doc/queue.html.
- Modify
egs/callhome/v1/run_prepare_shared.sh
according to the storage paths of your corpora.
cd egs/callhome/v1
./run_prepare_shared.sh
./run.sh
local/run_blstm.sh
[1] Yusuke Fujita, Naoyuki Kanda, Shota Horiguchi, Kenji Nagamatsu, Shinji Watanabe, "End-to-End Neural Speaker Diarization with Permutation-free Objectives," Proc. Interspeech, pp. 4300-4304, 2019
[2] Yusuke Fujita, Naoyuki Kanda, Shota Horiguchi, Yawen Xue, Kenji Nagamatsu, Shinji Watanabe, "End-to-End Neural Speaker Diarization with Self-attention," arXiv preprint arXiv:1909.06247, 2019
@inproceedings{Fujita2019Interspeech,
author={Yusuke Fujita and Naoyuki Kanda and Shota Horiguchi and Kenji Nagamatsu and Shinji Watanabe},
title={{End-to-End Neural Speaker Diarization with Permutation-free Objectives}},
booktitle={Interspeech},
pages={4300--4304},
year=2019
}