Do you have any ideas or code changes to guide the generated speech style? For example, having the appropriate emotion if we are reading a news story about a tragedy.
I understand there are a few Tacotron projects that achieve this, but their methods often lead to degraded voice quality (in my opinion).
One crazy idea that is easy to try, but probably won't work, is to train on a new dataset and embed the emotion into the generated sequence encoding.
We tried something similar to gain more control over the synthesised speech in one of our follow-up works, which has already been submitted.
The key idea is that you can have an additional encoder process the extra information (e.g. an emotion label) into some controllable space. The concatenation of this additional encoder's output with the text encoder's output then forms the input sequence of states to the neural HMM. To provide stronger conditioning, we also concatenated the additional information into the output net of the system. This worked and gave us control over the feature space.
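A minimal PyTorch sketch of that conditioning scheme, assuming a discrete emotion label as the additional information. The module and dimension choices here (`ConditionedEncoder`, `emo_dim`, `n_emotions`) are hypothetical illustrations, not the actual code from the submitted work:

```python
import torch
import torch.nn as nn

class ConditionedEncoder(nn.Module):
    """Concatenates an additional (emotion) encoding onto text encoder states."""

    def __init__(self, emo_dim=16, n_emotions=5):
        super().__init__()
        # Additional encoder: here simply an embedding table that maps a
        # discrete emotion label into a small controllable space.
        self.emotion_embedding = nn.Embedding(n_emotions, emo_dim)

    def forward(self, text_states, emotion_id):
        # text_states: (batch, seq_len, text_dim) from the text encoder
        # emotion_id:  (batch,) integer emotion labels
        emo = self.emotion_embedding(emotion_id)              # (batch, emo_dim)
        emo = emo.unsqueeze(1).expand(-1, text_states.size(1), -1)
        # Concatenate along the feature axis; the result would serve as the
        # input sequence of states to the neural HMM.
        return torch.cat([text_states, emo], dim=-1)          # (batch, seq_len, text_dim + emo_dim)

states = ConditionedEncoder()(torch.randn(2, 10, 256), torch.tensor([0, 3]))
print(states.shape)  # torch.Size([2, 10, 272])
```

For the stronger conditioning mentioned above, the same `emo` vector would additionally be concatenated to the input of the output net.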
Hope this helps! Feel free to ping in case you have further questions :D
Also, PS: in the coming days we are releasing an upgraded model that performs better than the baseline systems, including Neural-HMM, Tacotron 2 with Postnet, and Glow-TTS, in terms of clarity and naturalness.
We have released OverFlow. With OverFlow, not only do we get better naturalness and more accurate pronunciations, but we also show speaker adaptation in a low-resource setting by simply fine-tuning the model on a much smaller dataset.
Hopefully, it will be useful for your use case as well.
This is really impressive work!