Releases: yistLin/dvector

D-vector trained on VoxCeleb1

12 May 02:39

Pretrained models

The model was trained on the VoxCeleb1 dataset.

Model details:

  • 40-dim log mel spectrogram as input
  • 3-layer LSTM with a hidden dimension of 256
  • 256-dim attentive-pooled speaker embedding
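
For readers who want to see how these three details fit together, here is a minimal PyTorch sketch of an encoder with this shape. It is an illustration under stated assumptions, not the released model's code: the class name `AttentivePooledLSTM`, the `tanh` projection, the single-layer attention scorer, and the final L2 normalization are all assumptions chosen to make the example concrete.

```python
import torch
import torch.nn as nn


class AttentivePooledLSTM(nn.Module):
    """Hypothetical d-vector encoder sketch (names and details are assumptions)."""

    def __init__(self, dim_input=40, dim_cell=256, dim_emb=256, num_layers=3):
        super().__init__()
        # 3-layer LSTM over 40-dim log-mel frames, hidden size 256
        self.lstm = nn.LSTM(dim_input, dim_cell, num_layers, batch_first=True)
        # Per-frame projection to the embedding space (assumed)
        self.embedding = nn.Linear(dim_cell, dim_emb)
        # One scalar attention score per frame (assumed scorer)
        self.attention = nn.Linear(dim_emb, 1)

    def forward(self, mels):
        # mels: (batch, frames, 40)
        outputs, _ = self.lstm(mels)                   # (batch, frames, 256)
        embeds = torch.tanh(self.embedding(outputs))   # (batch, frames, 256)
        # Softmax over the time axis gives one weight per frame
        weights = torch.softmax(self.attention(embeds), dim=1)  # (batch, frames, 1)
        pooled = torch.sum(embeds * weights, dim=1)    # (batch, 256)
        # L2-normalize so embeddings can be compared by cosine similarity (assumed)
        return pooled / pooled.norm(p=2, dim=-1, keepdim=True)
```

Passing a tensor of shape `(batch, frames, 40)` yields one 256-dim vector per utterance, regardless of the number of frames, because the attention weights collapse the time axis.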

Training details:

  • 64 speakers, 10 utterances per speaker in a batch
  • 250K steps
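
The batch composition above works out to 64 × 10 = 640 utterances per step. A minimal sketch of drawing such a batch is shown below, using only the standard library. The function name `sample_batch`, the dict-of-lists input layout, and sampling without replacement are assumptions for illustration; the release notes do not say how the training code samples.

```python
import random


def sample_batch(utterances_by_speaker, n_speakers=64, n_utterances=10, rng=None):
    """Draw n_speakers speakers and n_utterances utterances from each.

    utterances_by_speaker: dict mapping a speaker id to a list of utterance
    ids (hypothetical layout, not taken from the repository).
    """
    rng = rng or random.Random()
    # Only speakers with enough utterances can fill their share of the batch
    eligible = [s for s, utts in utterances_by_speaker.items()
                if len(utts) >= n_utterances]
    if len(eligible) < n_speakers:
        raise ValueError("not enough speakers with sufficient utterances")
    speakers = rng.sample(eligible, n_speakers)
    # Sample without replacement within each speaker (assumed)
    return [(s, rng.sample(utterances_by_speaker[s], n_utterances))
            for s in speakers]
```

Each returned entry pairs a speaker id with that speaker's 10 sampled utterances, so the flattened batch holds 640 utterances.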

D-vector trained on VoxCeleb1

25 Jan 06:05

This release addresses the module loading issue that appears after upgrading torchaudio to 0.8.0.

Pretrained models

The model was trained on the VoxCeleb1 dataset.

Model details:

  • 40-dim log mel spectrogram as input
  • 3-layer LSTM with a hidden dimension of 256
  • 256-dim attentive-pooled speaker embedding

Training details:

  • 64 speakers, 10 utterances per speaker in a batch
  • 250K steps