fgnt/tssep_data

TS-SEP: Joint Diarization and Separation Conditioned on Estimated Speaker Embeddings

This repository contains the data preparation and evaluation code for the TS-VAD and TS-SEP experiments in our 2024 IEEE/ACM TASLP article, TS-SEP: Joint Diarization and Separation Conditioned on Estimated Speaker Embeddings, by Christoph Boeddeker, Aswin Shanmugam Subramanian, Gordon Wichern, Reinhold Haeb-Umbach, and Jonathan Le Roux (IEEE Xplore, arXiv).

The core and training code is available at https://github.com/merlresearch/tssep.

Installation

Using an existing environment, you can install the data preparation code with:

git clone https://github.com/merlresearch/tssep.git
cd tssep
pip install -e .
cd ..
git clone https://github.com/fgnt/tssep_data.git
cd tssep_data
pip install -e .

If you want to set up a fresh environment, see tools/README.md. Once the fresh environment is installed, you can activate it with . tools/path.sh (this also sets some environment variables).
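After either installation route, a quick sanity check is to confirm that both packages can be found by the active Python interpreter. The following is a minimal sketch, not part of the repository; the import names `tssep` and `tssep_data` are assumed to match the repository names, and `missing_packages` is a hypothetical helper.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level module names that the interpreter cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Assumption: the import names equal the repository names.
    missing = missing_packages(["tssep", "tssep_data"])
    if missing:
        print("Not importable:", ", ".join(missing))
    else:
        print("Both packages are importable.")
```

If a package is reported as not importable, the usual cause is that `pip install -e .` was run in a different environment than the one currently active.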

Note: Kaldi and MPI are required for the recipes. For ASR, you can use openai-whisper, espnet or nemo_toolkit as alternatives. ToDo: Limit this to whisper, since it has fewer dependencies.
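To see which of these optional dependencies are present before starting a recipe, something like the following sketch may help. It only uses the Python standard library; the helper names are hypothetical, the launcher names `mpirun` and `mpiexec` are the common MPI defaults rather than something the recipes are confirmed to call, and Kaldi is not checked because its location depends on the local setup.

```python
import shutil
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a pip distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def first_on_path(candidates):
    """Return the path of the first executable found on PATH, or None."""
    for name in candidates:
        path = shutil.which(name)
        if path is not None:
            return path
    return None


if __name__ == "__main__":
    # ASR alternatives named in the note above (pip distribution names).
    for dist in ("openai-whisper", "espnet", "nemo_toolkit"):
        print(dist, installed_version(dist) or "not installed")
    # Assumption: the MPI launcher is called mpirun or mpiexec.
    print("MPI launcher:", first_on_path(["mpirun", "mpiexec"]) or "not found")
```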

LibriCSS data preparation, training and evaluation

egs/libri_css/README.md#steps-to-run-the-recipe contains the instructions for the LibriCSS data preparation, training and evaluation.

LibriCSS evaluation with pretrained model

egs/libri_css/README.md#steps-to-evaluate-a-pretrained-model contains the instructions for the LibriCSS evaluation with a pretrained model.

Cite

If you use this code, please cite our paper:

@article{Boeddeker2024feb,
    author = {Boeddeker, Christoph and Subramanian, Aswin Shanmugam and Wichern, Gordon and Haeb-Umbach, Reinhold and Le Roux, Jonathan},
    title = {{TS-SEP}: Joint Diarization and Separation Conditioned on Estimated Speaker Embeddings},
    journal = {IEEE/ACM Transactions on Audio, Speech, and Language Processing},
    year = 2024,
    volume = 32,
    pages = {1185--1197},
    month = feb,
}