Discord? Discuss ONNX implementation #72

catselectro · 2024-06-08T09:26:43Z

Hi,

I just found this project and the repository at https://github.com/instant-high/wav2lip-onnx-HQ. I think that combining both may make this even faster.

I ran some quick tests using the ONNX model from this repository (with the help of ChatGPT), and it seems I get about 15%-20% faster generation times. However, I've never implemented something like this before, so I might be doing something wrong.

Your Discord link seems to be down, so I couldn't contact you there. Do you have another link or another way to chat?

Thanks for this awesome project.

Best.

anothermartz · 2024-06-08T09:34:58Z

Interesting, I'll have to give this ONNX project a try, and then perhaps I can implement an easy install/GUI for it.

I'm spending less time at my computer at the moment, though, because I'm about to move home and there's a lot of planning going on.

Here's the DeepFaceLab discord with a wav2lip channel that's good for discussing all this stuff:

https://discord.com/invite/9scUkmcf8V

catselectro · 2024-06-08T09:44:20Z

Thanks, I'll take a look. If you want my quick implementation, I can send you the file. I just changed inference.py to use the ONNX model. Good luck with your projects!

Echolink50 · 2024-06-08T13:53:26Z

Are there any other improvements besides the speed increase? Thanks

catselectro · 2024-06-08T17:16:22Z

I noticed a slight increase in VRAM usage when using the ONNX model, from 0.3 GB to 0.7 GB, so there's no improvement in that aspect. The model's file size is reduced to a quarter of the original. There might be potential for further improvements in VRAM, but I'm not sure.

Echolink50 · 2024-06-08T17:41:58Z

OK, the VRAM increase is not too bad. Did you also use the "new" face detection and alignment mentioned, or any of the "new" face enhancers? Any improvements in the quality of the lip sync? Thanks

catselectro · 2024-06-08T19:17:03Z

I used all the functionality of this repo. I just swapped the model for the ONNX version from the repo I cited, so the quality is the same, and I used the "improved" method from this repo.

Echolink50 · 2024-06-08T20:17:23Z

Oh, OK. I saw that the ONNX repo had some other features, like different face restoration models and different detection and alignment. I will check it out. Thanks

anothermartz · 2024-06-08T22:00:35Z

> I used all the functionality of this repo. I just swapped the model for the ONNX version from the repo I cited, so the quality is the same, and I used the "improved" method from this repo.

So you mean you just used the wav2lip.onnx file instead of the Wav2Lip.pth file?

I see no difference in speed between the two in my own tests, but I think GPEN is faster than GFPGAN, at least according to tests I did for that using the ONNX project.
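Since the speed results reported in this thread differ, a shared timing harness might make the comparison easier to reproduce. The following is a minimal sketch using only the standard library; `time_runs` and `speedup_percent` are hypothetical helper names and are not part of either repo.

```python
import statistics
import time
from typing import Callable, List


def time_runs(fn: Callable[[], object], warmup: int = 2, repeats: int = 5) -> List[float]:
    """Call fn a few times untimed, then return the wall-clock time of each timed run."""
    for _ in range(warmup):
        fn()  # warm-up runs absorb one-off costs such as model loading or caching
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    return timings


def speedup_percent(baseline: List[float], candidate: List[float]) -> float:
    """Percent reduction in median run time of candidate relative to baseline."""
    base = statistics.median(baseline)
    cand = statistics.median(candidate)
    return (base - cand) / base * 100.0
```

To use it, wrap each pipeline in a zero-argument callable that processes the same clip with the same settings (for example, hypothetical `run_pth` and `run_onnx` functions) and pass the two timing lists to `speedup_percent`. On a GPU, the PyTorch callable would likely also need a `torch.cuda.synchronize()` call before it returns, because CUDA kernels launch asynchronously and the timer could otherwise stop early.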

I'm more interested in the improved face tracking and also the cool little crop feature where you select the face location to make things faster that way.

But making an easy installer for that project would take more work than I'm willing to do at the moment. It's still wav2lip after all, so while there are improvements, they're not groundbreaking enough for me to adopt at this time.

Echolink50 · 2024-06-08T22:09:47Z

> > I used all the functionality of this repo. I just swapped the model for the ONNX version from the repo I cited, so the quality is the same, and I used the "improved" method from this repo.
>
> So you mean you just used the wav2lip.onnx file instead of the Wav2Lip.pth file?
>
> I see no difference in speed between the two in my own tests, but I think GPEN is faster than GFPGAN, at least according to tests I did for that using the ONNX project.
>
> I'm more interested in the improved face tracking and also the cool little crop feature where you select the face location to make things faster that way.
>
> But making an easy installer for that project would take more work than I'm willing to do at the moment. It's still wav2lip after all, so while there are improvements, they're not groundbreaking enough for me to adopt at this time.

Can you release the ONNX implementation and the new features you tested, for manual install? Thanks

catselectro · 2024-06-08T22:48:52Z

> > I used all the functionality of this repo. I just swapped the model for the ONNX version from the repo I cited, so the quality is the same, and I used the "improved" method from this repo.
>
> So you mean you just used the wav2lip.onnx file instead of the Wav2Lip.pth file?

Yes, I modified inference.py to load it instead of the .pth file. I noticed this slight speed improvement only when using the ONNX model in this project, not when using the ONNX project by itself, but I didn't do extensive testing there because the mouth quality in this project seems better, at least for the example I was testing. This is how I modified inference.py: https://gist.github.com/catselectro/90627227b93c92eb0909d2392fa1239a#file-inference_onnx_new-py
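For anyone who wants the general idea without opening the gist, the swap could look roughly like the sketch below. This is not the gist's code. It assumes the `onnxruntime` package is installed and that the exported model takes exactly two inputs, the mel batch first and the frame batch second, mirroring the PyTorch call order. The function names are hypothetical, and the real input names and order should be checked with `session.get_inputs()`.

```python
from typing import List, Sequence


def pick_providers(available: Sequence[str]) -> List[str]:
    """Prefer CUDA when onnxruntime reports it, and keep CPU as a fallback."""
    preferred = ["CUDAExecutionProvider", "CPUExecutionProvider"]
    chosen = [p for p in preferred if p in available]
    return chosen or ["CPUExecutionProvider"]


def load_session(model_path: str):
    """Create an onnxruntime session for the exported model file."""
    import onnxruntime as ort  # imported lazily so pick_providers has no dependencies

    providers = pick_providers(ort.get_available_providers())
    return ort.InferenceSession(model_path, providers=providers)


def run_batch(session, mel_batch, img_batch):
    """Run one batch of float32 NumPy arrays and return the first model output."""
    # Assumption: two inputs, mel first and frames second. Verify this against
    # session.get_inputs() for the specific export before relying on it.
    mel_input, img_input = session.get_inputs()
    feed = {mel_input.name: mel_batch, img_input.name: img_batch}
    return session.run(None, feed)[0]
```

With this shape, the part of inference.py that calls the PyTorch model on each batch would call `run_batch` instead, passing NumPy arrays rather than tensors. The rest of the pipeline (face detection, blending, enhancement) would stay as it is.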
