-
Notifications
You must be signed in to change notification settings - Fork 918
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Releasing RNNoise version 0.2 #221
Comments
[Jean-Marc Valin]
This new release brings many improvements, including improved
training, SSE4.1 and AVX2 optimizations, and run-time CPU detection
(RTCD).
The distributed models are now trained using only publicly available
datasets.
This sound very good. How can I learn how to do the training myself?
That knowledge would be a great argument for getting this into the
Debian main distribution.
<URL: https://bugs.debian.org/980839 >
…--
Happy hacking
Petter Reinholdtsen
|
This release expects the build process to execute a arbitrary-shellscript to download an arbitrary file as part of the build process; outside of cmake, in a potentially privileged environment. If I can find the time, I will PR fix for your Cmake or alike; I'm sorry I'm not doing it now - because it's very exciting to me this project is moving again. But as it stands, this is a pretty big hole for practical secure CI-pipelines and reliability in isolated or secure environments. THAT SAID: I'm honestly glad this project is getting some much needed attention; but please remember that it's been relied upon for various uses, on various platforms for a very long time now. If your not considering the need to build this as a standalone Risc-V library for a non-linux operating-system; You should be. And many, many more that I'm aware of. Keep up the good work, do good things, and have pride - just don't forget the use-cases beyond your own. ;) |
[John Bargman]
This release expects the build process to execute a
arbitrary-shellscript to download an arbitrary file as part of the
build process; outside of cmake, in a potentially privileged
environment.
This sound like a prospective xz disaster, indeed. Where is the code
causing this? At least the Debian build servers do not all have
Internet access during build, so it would fail at least some
architectures.
…--
Happy hacking
Petter Reinholdtsen
|
So there's two things here: one is the download script and the other is the actual content being fetched. In terms of the script itself, it's fairly simple to audit and see that it's not doing anything bad. It's also in git and signed, so I don't really see a trust issue compared to trusting the rest of the source code. When it comes the the actual model data being fetched, I can see how it's not as safe as the rest of the Git content (sha1 and signature). One possible way to solve that would be the equivalent to what was just added to Opus: The idea Is that Git will hold the sha256 of the downloaded content and thus it can check that the checksums actually match before using it. That way compromising the server will just result in a build failure. |
This is without question an acceptable solution - I come from a version of linux called NixOS; our package manager does this for all external downloads as part of the build step; ( which is how I came to be here: NixOS/nixpkgs#304433 ) The shellscript itself is an acceptable compromise if an external download is absolutely required; in my personal ideal external data would be fetched by git submodule (providing a solid revision control), or the dependency treated as such. It seems the model for RNNoise has always been a dependency; |
[Jean-Marc Valin]
So there's two things here: one is the download script and the other
is the actual content being fetched. In terms of the script itself,
it's fairly simple to audit and see that it's not doing anything
bad. It's also in git and signed, so I don't really see a trust issue
compared to trusting the rest of the source code.
Most Linux distributions block package building from downloading stuff
from the Internet, to ensure reproducibility, consistency and
completeness, and I would strongly suggest to not base the RNNoise build
on such practice for rnnoise.
…--
Happy hacking
Petter Reinholdtsen
|
@petterreinholdtsen Do you have alternatives to suggest for distributing a 20 MB binary file? In version 0.1, the model was much smaller (and it still wasn't great) and didn't include the binary model (which is the preferred form for making modifications in that case). |
[Jean-Marc Valin]
@petterreinholdtsen Do you have alternatives to suggest for
distributing a 20 MB binary file? In version 0.1, the model was much
smaller (and it still wasn't great) and didn't include the binary
model (which *is* the preferred form for making modifications in that
case).
I would suggest to include it in the git repository and release
tarballs.
…--
Happy hacking
Petter Reinholdtsen
|
All package managers I know of have the ability to specify multiple source packages - I'd expect them to specify the model data as one of the sources. The nixpkgs approach seems good. Having a hash of the data in git should be enough for revision control to ensure that it has the same version of the model as the code was designed for. Submodules remain tricky to use, especially because the model is large and most people would want a shallow clone of it. |
OK, the download process should be safer now as the download script will verify the sha256 before using the downloaded binary file. |
[Thomas Daede]
Having a hash of the data in git should be enough for revision control
to ensure that it has the same version of the model as the code was
designed for.
Note, for release tarballs, such approach will break the deserted island
test of free software. Those bringing the tarball and all its build
dependencies with you to a computer on a deserted island with no
Internet connectivity would not be able to build it as the build require
Internet connectivity and the abililty to download the model without
interferrence.
I heard a variant of the deserted island a few years ago, where the
people gathering together at AfricaSource, a free software development
confernece in the middle of a savannah, with no Internet connectivity,
used a collecting of Debian DVDs as their source of build dependencies.
…--
Happy hacking
Petter Reinholdtsen
|
I have to agree with this statement - although one would hope that the dependency will find it's way onto a debian disk too ;) |
Unless I missed something, the RNNoise 0.2 passes the desert island test fine. The download mechanism only applies to Git. |
[Jean-Marc Valin]
Unless I missed something, the RNNoise 0.2 passes the desert island
test fine. The download mechanism only applies to Git.
Sound good. From where can I download a tarball with the model
included, to verify no download from Internet take place during build?
…--
Happy hacking
Petter Reinholdtsen
|
On the Releases page: https://github.com/xiph/rnnoise/releases/download/v0.2/rnnoise-0.2.tar.gz |
This new release brings many improvements, including improved training, SSE4.1 and AVX2 optimizations, and run-time CPU detection (RTCD).
The distributed models are now trained using only publicly available datasets.
The text was updated successfully, but these errors were encountered: