Hi authors,
Thanks for releasing the code for your great work! After reading the paper, I am trying to replicate the simultaneous translation experiment that uses the CIF output as the pre-decision policy.
However, I have a question about the underlying model, wav2vec 2.0. As I understand it, it requires the complete audio input to encode features, since it uses standard (non-causal) convolutions and a Transformer encoder and does not support streaming. How, then, is inference performed for the wav2vec_cif model? From the paper, it seems that no additional training was done to make wav2vec 2.0 support partial input during streaming inference.
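For concreteness, below is a rough sketch of the only evaluation setup I can currently imagine with an unmodified wav2vec 2.0: re-encoding the entire audio prefix received so far at every READ step. This is just my assumption, not code from this repo; I use Hugging Face's Wav2Vec2Model as a stand-in for your encoder, and the chunk size and names are purely illustrative.

```python
# Hypothetical prefix re-encoding with an unmodified wav2vec 2.0 encoder.
# Hugging Face's Wav2Vec2Model is used as a stand-in for the encoder in this
# repo; the CIF module / pre-decision policy is left as a placeholder.
import torch
from transformers import Wav2Vec2Model

model = Wav2Vec2Model.from_pretrained("facebook/wav2vec2-base")
model.eval()

def encode_prefix(waveform: torch.Tensor, num_samples: int) -> torch.Tensor:
    """Re-encode the audio prefix received so far (non-causal, full self-attention)."""
    prefix = waveform[:, :num_samples]            # (batch=1, samples)
    with torch.no_grad():
        return model(prefix).last_hidden_state    # (1, frames, hidden)

# Simulated streaming: a new chunk (assumed 320 ms here) arrives at each READ
# action, and the whole prefix is re-encoded from scratch before the CIF-based
# pre-decision is made on the updated encoder states.
sample_rate = 16000
audio = torch.randn(1, 5 * sample_rate)           # dummy 5-second utterance
chunk = int(0.32 * sample_rate)                    # illustrative chunk size
for received in range(chunk, audio.size(1) + 1, chunk):
    states = encode_prefix(audio, received)
    # ... run CIF / the pre-decision policy on `states` here ...
    print(f"received {received} samples -> {states.size(1)} encoder frames")
```

Is this roughly what happens during evaluation, or does it work differently?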
I would greatly appreciate it if you could provide additional details about how evaluation is performed for the streaming case!