Discord? Discuss ONNX implementation #72
Interesting, I'll have to give this ONNX project a try, and then perhaps I can implement an easy install/GUI for it. I'm spending less time at my computer at the moment, though, because I'm about to move home and there's lots of planning and busyness going on for me. Here's the DeepFaceLab Discord with a wav2lip channel that's good for discussing all this stuff:
Thanks, I'll take a look. If you want my quick implementation, I can send you the file; I just changed inference.py to use the ONNX model. Good luck with your projects!
Are there any other improvements besides the speed increase? Thanks
I noticed a slight increase in VRAM usage when using the ONNX model, from 0.3 GB to 0.7 GB, so there's no improvement in that aspect. The model's file size is reduced to a quarter of the original. There might be potential for further improvements in VRAM, but I'm not sure.
OK, the VRAM increase is not too bad. Did you also use the "new" face detection and alignment mentioned, or any of the "new" face enhancers mentioned? Any improvements in the quality of the lip sync? Thanks
I used all the functionality in this repo. I just swapped the model for the ONNX version from the repo I cited, so quality is the same, and I used the "improved" method in this repo.
Oh, OK. I saw that the ONNX repo had some other features, like different face restoration models and different detection and alignment. I will check it out. Thanks
So you mean you just used the wav2lip.onnx file instead of the Wav2Lip.pth file? I see no difference in speed between the two in my own tests, but GPEN, I think, is faster than GFPGAN, at least according to tests I did using the ONNX project. I'm more interested in the improved face tracking, and also the cool little crop feature where you select the face location to make things faster that way. But making an easy installer for that project would take me more work than I'm willing to do at the moment; it's still wav2lip after all, so while there are improvements, they're not groundbreaking enough for me to adopt at this time.
Can you release the ONNX implementation and the new features you tested for manual install? Thanks
Yes, modifying inference.py to load it instead of the .pth file. I noticed this slight speed improvement only when using the ONNX model within this project, not when using the ONNX project by itself, but I didn't do extensive testing there because the mouth quality in this project seems better, at least for the example I was testing. This is how I modified inference.py: https://gist.github.com/catselectro/90627227b93c92eb0909d2392fa1239a#file-inference_onnx_new-py
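For anyone who wants the gist of the change without opening the link: a rough sketch of swapping the .pth checkpoint for ONNX Runtime is below. The input ordering (mel spectrogram batch first, then face-frame batch) and the float32 dtype are assumptions about the exported graph, not something confirmed in this thread, so check them against your own export.

```python
def pick_providers(device):
    """Prefer the CUDA execution provider when requested, else CPU only."""
    if device == "cuda":
        return ["CUDAExecutionProvider", "CPUExecutionProvider"]
    return ["CPUExecutionProvider"]

def load_onnx_model(path, device="cuda"):
    """Stands in for load_model(): returns an ONNX Runtime session
    instead of a torch module built from the .pth checkpoint."""
    import onnxruntime as ort  # deferred so the rest of the script imports cleanly
    return ort.InferenceSession(path, providers=pick_providers(device))

def onnx_predict(session, mel_batch, img_batch):
    """Stands in for `model(mel_batch, img_batch)` in the main loop.
    Maps the two batches onto the graph inputs by position (assumed order)."""
    import numpy as np
    feed = {
        inp.name: np.asarray(arr, dtype=np.float32)
        for inp, arr in zip(session.get_inputs(), (mel_batch, img_batch))
    }
    return session.run(None, feed)[0]
```

The deferred `import onnxruntime` keeps the original torch code path usable on machines without onnxruntime installed.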
Hi,
I just found this project and the repository at https://github.com/instant-high/wav2lip-onnx-HQ. I think that combining both may make this even faster.
I ran some quick tests using the ONNX model from this repository (with the help of ChatGPT), and it seems I get about 15%-20% faster generation times. However, I've never implemented something like this before, so I might be doing something wrong.
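A quick way to sanity-check a 15%-20% number like that is to time both code paths on an identical workload with warmup runs excluded. This is just a generic wall-clock harness, not code from either repo:

```python
import time

def time_inference(run_once, warmup=2, iters=10):
    """Average wall-clock seconds per call for a zero-arg callable.
    Warmup iterations are run first so one-time costs (model load,
    CUDA kernel compilation) don't skew the comparison."""
    for _ in range(warmup):
        run_once()
    start = time.perf_counter()
    for _ in range(iters):
        run_once()
    return (time.perf_counter() - start) / iters

# Usage sketch: wrap each path in a lambda and compare the averages, e.g.
#   pth_t  = time_inference(lambda: run_pth_generation())   # hypothetical helpers
#   onnx_t = time_inference(lambda: run_onnx_generation())
#   speedup = (pth_t - onnx_t) / pth_t
```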
Your Discord link seems to be down, so I couldn't contact you there. Do you have another link or another way to chat?
Thanks for this awesome project.
Best.