-
There is no easy way to feed your VAD into the existing diarization pipeline, as the latter does not explicitly rely on a VAD step. You'd have to design your own speaker diarization pipeline: you can use PretrainedSpeakerEmbedding to extract embeddings and then any clustering algorithm, from scikit-learn for instance.
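For illustration, here is a minimal sketch of that recipe. The file name, the VAD segment times, the speaker count of two, and the speechbrain/spkrec-ecapa-voxceleb checkpoint are all placeholder assumptions to adapt:

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering

from pyannote.audio import Audio
from pyannote.audio.pipelines.speaker_verification import PretrainedSpeakerEmbedding
from pyannote.core import Annotation, Segment

# Speech regions from your own VAD, as (start, end) in seconds -- placeholder values.
vad_segments = [Segment(0.5, 3.2), Segment(4.0, 7.8), Segment(9.1, 12.0)]

# Any embedding checkpoint supported by PretrainedSpeakerEmbedding works here.
embedding_model = PretrainedSpeakerEmbedding("speechbrain/spkrec-ecapa-voxceleb")
audio = Audio(sample_rate=16000, mono="downmix")

# Extract one speaker embedding per speech segment.
embeddings = []
for segment in vad_segments:
    waveform, sample_rate = audio.crop("file.wav", segment)
    # The model expects a (batch, channel, sample) tensor and returns (batch, dimension).
    embeddings.append(embedding_model(waveform[None]))
embeddings = np.vstack(embeddings)

# Cluster embeddings into speakers; use distance_threshold instead of
# n_clusters if the number of speakers is not known in advance.
labels = AgglomerativeClustering(n_clusters=2).fit_predict(embeddings)

# Wrap the result in an Annotation, the usual diarization output format.
diarization = Annotation(uri="file")
for segment, label in zip(vad_segments, labels):
    diarization[segment] = f"SPEAKER_{label:02d}"

print(diarization)
```

Agglomerative clustering is just one option; any scikit-learn clustering algorithm that accepts a (num_segments, dimension) array would work in its place.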
-
I have already performed the VAD step using a different model. If required, I can create a pyannote.core.Annotation object from my VAD data. I believe this data more or less represents the output of the segmentation stage. Is my understanding correct?
Now the question is: how can I feed this object into the embedding and clustering steps to perform diarization? I am currently using the develop branch.
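For reference, a minimal sketch of building such an Annotation from external VAD output, assuming it comes as a list of (start, end) times in seconds (placeholder values):

```python
from pyannote.core import Annotation, Segment

# (start, end) pairs in seconds from the external VAD model -- placeholder values.
vad_output = [(0.5, 3.2), (4.0, 7.8)]

vad = Annotation(uri="file")
for start, end in vad_output:
    vad[Segment(start, end)] = "SPEECH"

# Speech regions can then be iterated for embedding extraction.
for segment in vad.itersegments():
    print(segment)
```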