Pretrained Models

Yist edited this page Dec 16, 2020 · 1 revision

The following models were trained on the VoxCeleb1 dataset.

Model details:

40-dim mel spectrogram as input
3 LSTM layers, each with a hidden dimension of 256
256-dim speaker embedding
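
As a rough illustration of the dimensions listed above, here is a minimal PyTorch sketch of an encoder with the same shape. The class name `SpeakerEncoder`, the final linear projection, the use of the last frame's output, and the L2 normalization are all assumptions made for illustration; this page does not confirm them, and the released checkpoints may not load into this module.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SpeakerEncoder(nn.Module):
    """Hypothetical encoder matching the dimensions listed on this page."""

    def __init__(self, n_mels=40, hidden_dim=256, num_layers=3, emb_dim=256):
        super().__init__()
        # 3 stacked LSTM layers, 40-dim mel frames in, 256 hidden units each
        self.lstm = nn.LSTM(n_mels, hidden_dim, num_layers=num_layers,
                            batch_first=True)
        # Assumption: a linear projection produces the 256-dim embedding
        self.proj = nn.Linear(hidden_dim, emb_dim)

    def forward(self, mels):
        # mels: (batch, frames, n_mels)
        outputs, _ = self.lstm(mels)
        # Assumption: summarize the utterance with the last frame's output
        last_frame = outputs[:, -1, :]
        emb = self.proj(last_frame)
        # Assumption: L2-normalize so embeddings can be compared by cosine similarity
        return F.normalize(emb, dim=-1)


encoder = SpeakerEncoder().eval()
with torch.no_grad():
    dummy = torch.randn(2, 160, 40)  # 2 utterances, 160 frames, 40 mel bins
    emb = encoder(dummy)
print(emb.shape)  # torch.Size([2, 256])
```

The frame count (160 here) is arbitrary; only the 40-dim input and 256-dim output are taken from the model details.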

Download links:
