Add support for Real-ESRGAN, plus various fixes (hangs, async video, etc.) #1133
Choosing realesrgan ignores the noise flag for now and always uses the realesrgan-x4plus model.
A few scalers output a lot of crap like `10% 25% 50% ...`, which messes with the progress bar indicator.
This provides a reproducible way to get all the needed dependencies and allows running the software in NixOS environments.
This solves a couple of issues:
1. The log level passed with `-l` was not properly applied to loguru; it was always using `debug`, the default.
2. When actually passing `-l debug`, ffmpeg would flood us with too much information; it is better to have a separate option for debugging ffmpeg issues.
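A minimal sketch of the fix described above, assuming loguru; the option plumbing and the `configure_logging` / `debug_ffmpeg` names are illustrative, not the PR's actual code:

```python
import sys
from loguru import logger

def configure_logging(level: str, debug_ffmpeg: bool = False) -> None:
    # Remove loguru's default sink (which logs at DEBUG) and install one at
    # the level actually passed with -l.
    logger.remove()
    logger.add(sys.stderr, level=level.upper())
    # A separate, hypothetical -L switch would independently control whether
    # ffmpeg's own verbose output is surfaced.
    if debug_ffmpeg:
        logger.debug("verbose ffmpeg output enabled")
```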
This can actually lead to videos that are corrupted, badly synchronized, or lacking skipping support. This option seems to do the right thing here and is innocuous in our case.
...this would cause the whole application to hang during its teardown sequence.
Since these are now only printed when passing `-l debug`, I think this is an acceptable compromise, and it can help debug the various hanging problems the application has had.
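A hedged sketch of the idea, routing the scalers' progress spam into the debug log instead of the console (the `handle_scaler_line` helper is an assumption for illustration, not the PR's code):

```python
from loguru import logger

def handle_scaler_line(line: str) -> None:
    # Lines like "10%" no longer hit the console (where they broke the
    # progress bar); they are only visible when running with -l debug.
    logger.debug("scaler output: {}", line.rstrip())
```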
This problem used to be more severe, but it has become less frequent with our fix for k4yt3x#1132. The problem happens because we used to take `frame_count` as absolute truth, and we would iterate until we had processed that many frames. However, as we've learnt, this is just an estimate: the `Decoder` thread can finish before we hit that frame count. With this change, we detect that scenario and gracefully finish the rendering process, making sure that all pending frames are read, processed, and written to the final stream.
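A hedged sketch of the graceful-finish logic described above (all names are illustrative; this is not the actual video2x code):

```python
import queue
import threading

def render(frames: "queue.Queue", decoder_done: "threading.Event", write_frame) -> None:
    # frame_count from the container is only an estimate, so instead of
    # looping a fixed number of times we drain the queue until the decoder
    # has finished AND no pending frames remain.
    while True:
        try:
            frame = frames.get(timeout=0.1)
        except queue.Empty:
            if decoder_done.is_set():
                break  # every decoded frame has been read and written
            continue
        write_frame(frame)
```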
Some anime files in particular like to include custom fonts and similar assets in these streams. I think it is useful to keep them, so as to keep the generated file as close to the original as possible.
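A hedged illustration of keeping such streams (not necessarily the exact flags used in this PR): in ffmpeg stream specifiers, `t` matches attachment streams such as embedded fonts, and a trailing `?` makes the mapping optional.

```python
# Example ffmpeg argument list that maps video, audio, subtitles, and
# attachments through, copying everything but the video.
ffmpeg_args = [
    "ffmpeg", "-i", "input.mkv",
    "-map", "0:v", "-map", "0:a?", "-map", "0:s?", "-map", "0:t?",
    "-c:a", "copy", "-c:s", "copy", "-c:t", "copy",
    "output.mkv",
]
```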
The code in https://github.com/arximboldi/video2x/blob/realesrgan/video2x/video2x.py#L131C1-L136C9 causes a failure when pynput is not available, because Python tries to subclass from Listener at import time. You should define that class in a scoped way.
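A hedged sketch of the "scoped" definition being suggested (illustrative, not the actual video2x code); the subclass is only created once the pynput import is known to have succeeded, so environments without pynput just skip the hotkeys:

```python
def start_pause_listener(on_press):
    try:
        from pynput.keyboard import Listener
    except ImportError:
        return None  # pynput unavailable (e.g. a headless Colab): no hotkeys

    # Defining the subclass here, in function scope, avoids subclassing
    # Listener at module import time.
    class PauseListener(Listener):
        pass

    listener = PauseListener(on_press=on_press)
    listener.start()
    return listener
```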
@arximboldi Also, when I try running with realesrgan on Google Colab, I get an error (which seems to be referenced in https://github.com/arximboldi/video2x/blob/realesrgan/video2x/decoder.py).
Turns out that `fps_mode` is a relatively new option in ffmpeg, and the standard Google Colab environment installs an older ffmpeg. But I was able to work around it.
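The actual workaround is not shown in the thread; a hedged guess at the kind of fix, based on the fact that `-fps_mode` was introduced in ffmpeg 5.1 as the replacement for the older `-vsync` option (the `fps_mode_args` helper is hypothetical):

```python
def fps_mode_args(ffmpeg_version: tuple, mode: str = "passthrough") -> list:
    # -fps_mode replaced -vsync in ffmpeg 5.1; older builds (like the one in
    # the stock Colab image at the time) only understand the legacy flag.
    if ffmpeg_version >= (5, 1):
        return ["-fps_mode", mode]
    return ["-vsync", mode]
```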
Similarly, to work around the poor integration of […] But that’s not how it should work :)
Thank you all @arximboldi @aa-ko @twardoch for the amazing work here. I've been very busy with work and several other non-FOSS projects over the last half of this year. Sorry that I missed this PR when it came in. I've been getting just way too many messages from GitHub and I didn't notice this PR was submitted; I only saw it after I completed the 6.0.0 rewrite. It has always been my goal to introduce Real-ESRGAN in Video2X, and I managed to do it in the rewrite. Take a look if you're still interested. Unfortunately I can't merge this without destroying the commit history, so I'll just have to close this. Sorry again for not responding to this in time. I really appreciate your amazing work!
@k4yt3x Wow, you actually just casually dropped the C++ rewrite; I am speechless 😮 As far as I can tell, this looks exactly like what I was hoping for. I'll try 6.0.0 ASAP, and maybe I can contribute something this time lmao. Thanks for all the hard work, cheers! 🎉
I’ll try it ASAP on Colab.
@twardoch I haven't updated the Colab playbook yet. You'll need to compile it yourself if you wanna try Colab.
I guess so. I wonder if you could provide simple instructions on how to build the stuff; I can adapt them to macOS and will PR. So far I had installed various libs on macOS via brew, and then `make` failed on realesrgan not being available. I will look at your GitHub Action for how you build on Linux and will try to adapt it for macOS.
@twardoch if you can actually make it work for Mac, that'll be amazing. I've never had a Mac in my life, so it'll be hard for me to make that work. As for steps to build, here's the one for Debian/Ubuntu (lines 36 to 54 at commit 411cca4):
Another one, for Arch, is in the PKGBUILD.
The Homebrew package manager for macOS does include "ncnn", but I'm not sure whether that means anything will have to be adapted from Vulkan to another backend. Right now the macOS situation for AI is very complicated, because drastically different backends are used for the new Apple Silicon Macs (which have more inference-acceleration possibilities) vs. the older Intel Macs (which can often do only CPU inference).
P.S. Many local AI packages only exist for Apple Silicon Macs, because the older Apple Intel hardware is basically completely unsuitable for local AI inference. For example, Topaz Video AI runs like 50x faster on my MacBook Air M3 than on a beefy Intel MacBook Pro that's less than 3 years older: a running time of 3 minutes vs. 2 hours to complete the same task.
That sounds like a Mac problem; I'm skeptical that, even without an NPU, a properly accelerated Coffee Lake could be so far behind. Anyhow, is Real-ESRGAN supposed to be the only supported driver now?
@mirh there's also libplacebo, which renders Anime4K v4 now, but it should be compatible with any mpv-compatible GLSL shader. Real-ESRGAN has both an anime model and a real-life model. From their paper, it also looks like its performance is better than RealSR (?), so I didn't bother adding RealSR. The other ones are kinda old, so I didn't bother either.
Their paper speaks the truth, and all things considered I would also probably recommend this as the "overall" default. Similarly, with an anime DVD, while it doesn't "wildly make stuff up" (when the camera pans across a scene with a corrugated iron roof and a metal fence, it seems like they are dancing with CUGAN), you can still slightly tell in certain places that it is trying too hard (so for this reason waifu2x seems a solid "safe ground"). And last but not least, just a few months ago I figured out that somehow SRMD (which I had always thought to be strictly inferior to the alternatives, at least as far as quality was concerned) could give me the best results bar none on a 369x207 crop of a picture from my phone. P.S. Anime4K, if anything, seems a bit pointless: precisely because it's as lightweight as it is fairly low quality, I don't think it's the kind of upscaler people would use in a complex tool like this.
I've been away for long enough to not know what the best options are anymore. In addition to collecting opinions, I'd also like to have the decision of what to add backed by testing results like VMAF (mostly because adding support for new solutions is time-consuming). It's a bit beyond my ability to do it all by myself. Ideally we'd create an environment where people can discuss and vote... perhaps try bringing this up in the Telegram group or something?
I'm also very much out of the loop, but honestly there hasn't been that much activity in the last few years (I assume there's only so much you can do with tiny generic networks?). At most there might be slightly differently trained models (that is a madhouse, tbf)? As for "objective testing®", I cannot recommend FFMetrics enough (and then maybe this?).
Both look interesting. Perhaps I should set up some kind of feature poll in the discussions for people to discuss and vote on any new model that should be implemented. My specialty is not actually in AI or CV, so I would really like it if people actually from those fields could bring up and discuss what's worth implementing.
Having eyes (and probably a lot of time to waste) is probably more important here than knowledge... like, papers do it too. One super dope thing that is missing, if anything, is some easy automatic way to test/compare different solutions (a bit like this perhaps, but for models rather than compression settings... even though those would be interesting too, now that I think about it).
This picks up the work in #1102 by @aa-ko to introduce Real-ESRGAN support, adding support for all models exposed by `realesrgan-ncnn-py`, and testing it outside of a Docker container.

Additionally, I have fixed a few issues that I've found. I wasn't sure whether you prefer to have multiple smaller PRs, or to just review it all at once. The work is separated into multiple commits, so I can still split it into multiple PRs if you prefer.
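For context, a hedged sketch of driving `realesrgan-ncnn-py` directly; the parameter names and the meaning of the `model` index follow my reading of that library's README and should be treated as assumptions, not code from this PR:

```python
from PIL import Image
from realesrgan_ncnn_py import Realesrgan

# model selects among the bundled weights (the realesr-animevideov3 variants,
# realesrgan-x4plus, realesrgan-x4plus-anime); 0 is the library default.
realesrgan = Realesrgan(gpuid=0, model=0)
with Image.open("input.png") as image:
    upscaled = realesrgan.process_pil(image)
    upscaled.save("output.png")
```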
These are the highlights from this PR:

There are also some smaller changes that I'm not sure you agree with, but that I think are quite convenient:

- A `shell.nix` environment that allows installing all the system dependencies in an isolated environment by simply running `nix-shell`. This was crucial for me to be able to run and test the program locally outside of a Docker container.
- Improved logging handling (fixes to the `-l` flag, and introduction of a new `-L` flag).

Thank you @aa-ko for starting the work on integrating Real-ESRGAN. Its `realesr-animevideov3` model is giving me incredible results, with image quality comparable to or better than `realcugan` for anime, yet much faster!

Thank you @k4yt3x for this incredible tool. Looking into the code has shown me how much love has gone into it, and it is working super well for me now!