A Voice Type Classifier For Child-Centered Daylong Recordings

This is the git repository associated with our Interspeech 2020 publication: An open-source voice type classifier for child-centered daylong recordings

In this repository, you'll find all the necessary code for applying a pre-trained model that, given an audio recording, classifies each frame into [SPEECH, KCHI, CHI, MAL, FEM].

FEM stands for female speech
MAL stands for male speech
KCHI stands for key-child speech
CHI stands for other child speech
SPEECH stands for speech :)
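
As a rough illustration of what frame-level classification means in practice, the sketch below turns a sequence of per-frame decisions for one class into time segments. It does not reflect this repository's actual code or output format: the `frames_to_segments` helper, the boolean per-frame input, and the frame duration are all hypothetical and chosen only for the example. Treating each class independently is also an assumption made here.

```python
from typing import Dict, List, Tuple

# Label descriptions, taken from the list above.
VOICE_TYPES: Dict[str, str] = {
    "FEM": "female speech",
    "MAL": "male speech",
    "KCHI": "key-child speech",
    "CHI": "other child speech",
    "SPEECH": "speech",
}


def frames_to_segments(
    active: List[bool], frame_duration: float
) -> List[Tuple[float, float]]:
    """Merge runs of consecutive active frames into (start, end) segments in seconds.

    Hypothetical helper: `active` holds one decision per frame for a single class,
    and `frame_duration` is the assumed length of one frame in seconds.
    """
    segments: List[Tuple[float, float]] = []
    start = None  # index of the first frame of the current run, if any
    for i, is_active in enumerate(active):
        if is_active and start is None:
            start = i
        elif not is_active and start is not None:
            segments.append((start * frame_duration, i * frame_duration))
            start = None
    if start is not None:
        # The last run extends to the end of the recording.
        segments.append((start * frame_duration, len(active) * frame_duration))
    return segments


if __name__ == "__main__":
    # Made-up decisions for the KCHI class, with an assumed 0.5 s frame.
    kchi_frames = [False, True, True, False, True]
    for start_s, end_s in frames_to_segments(kchi_frames, frame_duration=0.5):
        print(f"KCHI ({VOICE_TYPES['KCHI']}): {start_s:.1f}s to {end_s:.1f}s")
```

Running the same helper once per label would give one list of segments per voice type.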

Our model's architecture is based on SincNet [3] and LSTM layers. Details can be found in our paper [1]. The code mainly relies on pyannote-audio [2], an awesome Python toolkit that provides neural building blocks which can be combined to solve the speaker diarization task.

How to use?

Disclaimer /!\
Installation
Applying
Evaluation
Going further
Still stuck or feeling lost? Check out our IASCL24 tutorial (more extensive instructions): slides 9 to 20

Awesome tools using our voice type classifier

ALICE, an Automatic Linguistic Unit Count Estimator, allowing you to count the number of words, syllables and phonemes in adult speakers' utterances.

References

The main paper:

[1] An open-source voice type classifier for child-centered daylong recordings

@inproceedings{lavechin2020opensource,
  title = {An open-source voice type classifier for child-centered daylong recordings},
  author = {Marvin Lavechin and Ruben Bousbib and Hervé Bredin and Emmanuel Dupoux and Alejandrina Cristia},
  year = {2020},
  booktitle = {Interspeech}
}

We also encourage you to cite this work:

[2] pyannote.audio: neural building blocks for speaker diarization

@inproceedings{Bredin2020,
  Title = {{pyannote.audio: neural building blocks for speaker diarization}},
  Author = {{Bredin}, Herv{\'e} and {Yin}, Ruiqing and {Coria}, Juan Manuel and {Gelly}, Gregory and {Korshunov}, Pavel and {Lavechin}, Marvin and {Fustes}, Diego and {Titeux}, Hadrien and {Bouaziz}, Wassim and {Gill}, Marie-Philippe},
  Booktitle = {ICASSP 2020, IEEE International Conference on Acoustics, Speech, and Signal Processing},
  Address = {Barcelona, Spain},
  Month = {May},
  Year = {2020},
}
