jittery results #1

jnnyii · 2024-05-13T08:17:50Z

I plugged in a new dataset. After 1800 epochs, the generation appears to follow the text conditioning semantically, but the poses are too jittery (please see the attachment). Could you point out what might be wrong?

text: straightening up
https://github.com/dongzhuoyao/motionfm/assets/169649811/b4fa0242-97d5-4080-827f-3578c3cd0d84

thanks!

dongzhuoyao · 2024-05-13T21:18:07Z

Hi, thanks for your interest in our work. Could you elaborate on how large your dataset is, what your text encoder is, and how large the network is? Also, what sampler are you using, and how many sampling steps?

jnnyii · 2024-05-14T14:22:50Z

Thank you for your response. I have a dataset that contains 600 sequences with a total of 34000 frames. My step size is 1, and if the sequence length is larger than the maximum number of frames, I randomly select a start index. I am using the default text encoder in the framework, i.e. CLIP. Do you think I have too little data? I don't observe this problem when I train with the diffusion model.
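
For reference, the cropping described above could be sketched roughly as follows. This is only an illustration of "step size 1, random start index when the sequence is too long"; the function name `crop_sequence`, the `max_frames` parameter, and the list-of-frames layout are assumptions and not code from the repository.

```python
import random


def crop_sequence(frames, max_frames, rng=None):
    """Return at most `max_frames` consecutive frames (stride 1).

    `frames` is any sliceable sequence of per-frame pose data.
    Hypothetical helper, not part of the motionfm code base.
    """
    rng = rng or random.Random()
    num_frames = len(frames)
    if num_frames <= max_frames:
        # Short sequences are returned unchanged (padding is handled elsewhere).
        return frames
    # Pick a random start so the window [start, start + max_frames) fits.
    start = rng.randrange(num_frames - max_frames + 1)
    return frames[start:start + max_frames]


if __name__ == "__main__":
    sequence = list(range(100))  # stand-in for 100 frames of pose data
    window = crop_sequence(sequence, max_frames=60, rng=random.Random(0))
    print(len(window), window[0], window[-1])
```

Because the window is contiguous with stride 1, consecutive frames in the crop stay temporally adjacent, so the cropping itself should not introduce discontinuities into the training data.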
