Conv-SINet

Fully convolutional speaker identification solution

General architecture:

Directories:

Data is responsible for fetching the data set and transforming the data.
Encoder is the deep learning part of the project; it takes audio samples and produces vectors.
Identification uses these vectors to identify the speaker.
Localization is in charge of determining the position of the speaker.

Data: The data set used is VoxCeleb.

Encoder: We compare two approaches: one that uses the STFT (short-time Fourier transform) and one that is fully temporal.

"TransFourier.py" is the main Python file for the STFT approach.
"Time.py" is the main Python file for the temporal approach.
The Freq directory holds the notebooks related to the STFT approach. We try two vector distances (Euclidean vs. cosine) and three sample sizes (1, 2, or 3 seconds).
The Time directory holds the notebooks related to the fully temporal approach. We try three sample sizes (1, 2, or 3 seconds).
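
As an illustration of the two vector distances compared in the Freq notebooks, here is a minimal sketch in plain Python. The function names `euclidean_distance` and `cosine_distance` are hypothetical and are not taken from the repository; the notebooks may compute these distances differently (for example with a tensor library).

```python
import math

def euclidean_distance(u, v):
    """Straight-line distance between two embedding vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def cosine_distance(u, v):
    """1 - cosine similarity: depends only on the angle between the vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_u * norm_v)

# Two vectors pointing the same way but with different lengths:
a, b = [1.0, 0.0], [2.0, 0.0]
print(euclidean_distance(a, b))  # sensitive to the difference in magnitude
print(cosine_distance(a, b))     # ignores magnitude, only direction matters
```

The practical difference is that the Euclidean distance grows with the magnitude of the vectors, while the cosine distance only reflects their direction.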

Identification:

"Conference.py" is a class that represents a conference.
"Place.py" is a class that represents a specific place in the conference room.
"Speaker.py" is a class that represents a speaker taking part in the conference.
We continue to compare the temporal and STFT approaches. The values 20 and 40 refer to the number of speakers in the conference.
"xxx_enc_1_xxx.ipynb" uses the encoder trained with 1-second samples.
"xxx_sample_3_xxx.ipynb" uses 3-second samples for the identification.
"xxx_speaker_xxx.ipynb" identifies a speaker from one sample.
"Place_after_3_samples_xxx.ipynb" identifies the speaker at a place from a concatenation of 3 samples.
"xxx_1_ref_in_pool.ipynb" identifies a speaker with one reference in the pool.

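The identification step described above can be pictured as a nearest-reference lookup: the encoder output for an unknown sample is compared with the reference vectors in the pool, and the closest one gives the speaker. The sketch below assumes one reference vector per speaker and a Euclidean distance; the function `identify` and the speaker names are hypothetical and do not come from the repository's classes.

```python
import math

def identify(embedding, reference_pool):
    """Return the name of the speaker whose reference vector is closest.

    reference_pool maps a speaker name to one reference vector
    (an assumed layout, used here only for illustration).
    """
    if not reference_pool:
        raise ValueError("the reference pool is empty")
    return min(
        reference_pool,
        key=lambda name: math.dist(embedding, reference_pool[name]),
    )

# Toy 2-dimensional vectors standing in for encoder outputs.
pool = {"speaker_a": [0.0, 0.0], "speaker_b": [1.0, 1.0]}
print(identify([0.9, 0.8], pool))
```

`math.dist` requires Python 3.8 or later. Replacing it with a cosine distance would give the other variant compared in this project.
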
Localization:

This part is a work in progress.
The goal is to determine the delay between the microphones.
We try different output vector sizes (16, 32, 64, or 128).
"LocateEncoder.py" contains the encoder class in charge of localization.
It is a fully temporal encoder responsible for matching points of interest between the microphones' samples.
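
To make the notion of a delay between microphones concrete, here is a minimal sketch of a classical baseline: brute-force cross-correlation over a range of lags. This is not the learned encoder from "LocateEncoder.py"; the function `estimate_delay` and its parameters are hypothetical and only illustrate the quantity the localization part aims to recover.

```python
def estimate_delay(mic_a, mic_b, max_lag):
    """Return the lag (in samples) at which mic_b best matches mic_a.

    A positive result means the signal reaches mic_b later than mic_a.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, a in enumerate(mic_a):
            j = i + lag
            if 0 <= j < len(mic_b):
                score += a * mic_b[j]  # correlation at this lag
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag

# The same short pulse, received 3 samples later on the second microphone.
mic_a = [0, 0, 1, 2, 1, 0, 0, 0, 0, 0]
mic_b = [0, 0, 0, 0, 0, 1, 2, 1, 0, 0]
print(estimate_delay(mic_a, mic_b, max_lag=5))  # 3
```

Dividing the lag by the sampling rate gives the delay in seconds.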
