[Discussion] Minimize unnecessary resampling #646
Also partially relieves RustAudio#208. The rate converter still has to be fixed, but with a default sample rate less resampling is needed.
A more advanced solution could allow processing steps to query the current preferred sample rate and use that. With such an API one could, for example, open an output stream at 48 kHz and all processing steps attached to it would adjust accordingly. Another option is to make the sample rate a parameter of every processing step that may be affected by it, which is cumbersome but could still help keep 48 kHz everywhere if one needs that. Alternatively there could be a global switch readable by all Rodio code, but I am not sure I like that, since users may want more than one output stream with different sample rates. |
That could look like:
impl Source for Something {
    fn negotiate_samplerate(&mut self, requested_by_parent: usize) {
        // Forward the parent's request to the wrapped source.
        self.inner.negotiate_samplerate(requested_by_parent)
    }
}
Then the output calls We would need to add an API that allows specifying a different sample rate. That could nicely fit into the
struct FixedSampleRate<S> {
    rate: usize,
    inner: S,
}
impl<S: Source> Source for FixedSampleRate<S> {
    fn negotiate_samplerate(&mut self, requested_by_parent: usize) {
        debug!("FixedSampleRate set, ignoring samplerate requested by parent");
        // Override the parent's request with the fixed rate.
        self.inner.negotiate_samplerate(self.rate)
    }
}
trait Source {
    fn with_samplerate(self, rate: usize) -> FixedSampleRate<Self>
    where
        Self: Sized,
    {
        FixedSampleRate { rate, inner: self }
    }
}
This might even allow us to improve the mixer (which currently resamples both inputs) to use one sample rate without resampling where possible, for example when mixing a sine wave or noise with a decoder (a song playing from disk). |
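To make the negotiation idea concrete, here is a minimal self-contained sketch. It is not rodio's API: the `Source` trait below is a stand-in that models only the negotiation hook, it uses `u32` instead of `usize`, and returning the agreed rate from `negotiate_samplerate` is an assumption added so the result can be observed.

```rust
/// Hypothetical stand-in for rodio's `Source`; only negotiation is modelled.
trait Source {
    /// The parent asks for `requested` Hz; returns the rate this source will produce.
    fn negotiate_samplerate(&mut self, requested: u32) -> u32;
}

/// A generator that can produce any rate (for example a sine wave).
struct Sine {
    rate: u32,
}

impl Source for Sine {
    fn negotiate_samplerate(&mut self, requested: u32) -> u32 {
        // A generator has no native rate, so it simply adopts the request.
        self.rate = requested;
        self.rate
    }
}

/// Wrapper that ignores the parent's request and imposes its own rate.
struct FixedSampleRate<S> {
    rate: u32,
    inner: S,
}

impl<S: Source> Source for FixedSampleRate<S> {
    fn negotiate_samplerate(&mut self, _requested: u32) -> u32 {
        // The user-chosen rate wins over whatever the parent asked for.
        self.inner.negotiate_samplerate(self.rate)
    }
}
```

With this shape, an unwrapped `Sine` follows the output's request, while a `Sine` wrapped in `FixedSampleRate` keeps the user's rate and the output would have to resample once at the end.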
The different sample rates are a problem, especially since resampling incurs needless performance losses and introduces artifacts. I do not think that exporting a constant is the proper solution here, though.
In conclusion, I would say this is a problem worth fixing, so let's fix it properly. Your proposal to 'query current preferred sample rate' seems far superior to me. It not only solves the named problems, it also allows us to make the sample_rate choice optional! That in itself is worth the effort, for it makes rodio far simpler to use. |
related: #208 |
The intention for the constant was to document the state of affairs (see default stream initialization). It may not make sense for v1.0. I hoped that fully qualified name |
An empty callback produces nothing, but the receiving side may not know this unless it looks at size_hint. Yes, in this case the sample rate is not very useful since it can be optimized by using |
I want to play with this soon, given it might also be an answer to #493 (having empty get the channel count instructed from the parent/mixer it's part of). |
I'd prefer to have a constructor or builder handle that, since it gives users complete control if they want it. Something like
// Or maybe can use rodio::stream::StreamConfig here instead.
struct StreamShape {
    sample_rate: u32,
    channels: u8,
}
impl ASource {
    pub fn new(params: &ASourceConf, conf: &StreamShape) -> ASource {
        todo!()
    }
    pub fn connect_new(params: &ASourceConf, sink: &mut Sink) {
        // Build the source with the shape the sink expects, then attach it.
        let shape = sink.input_shape();
        sink.add(Self::new(params, &shape));
    }
} |
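As a rough illustration of the constructor approach, the sketch below builds a source that receives its stream parameters once, at construction, and never changes them. `SineSource` and its fields are hypothetical names invented for this example; only `StreamShape` comes from the comment above.

```rust
/// Description of the stream a source feeds into (from the proposal above).
#[derive(Clone, Copy, Debug, PartialEq)]
struct StreamShape {
    sample_rate: u32,
    channels: u8,
}

/// Hypothetical sine source producing interleaved samples,
/// with the same value on every channel of a frame.
struct SineSource {
    freq_hz: f32,
    shape: StreamShape,
    samples_emitted: u64,
}

impl SineSource {
    /// The shape is fixed here and cannot change afterwards.
    fn new(freq_hz: f32, shape: &StreamShape) -> Self {
        assert!(shape.channels > 0 && shape.sample_rate > 0, "shape must be non-empty");
        SineSource { freq_hz, shape: *shape, samples_emitted: 0 }
    }
}

impl Iterator for SineSource {
    type Item = f32;

    fn next(&mut self) -> Option<f32> {
        // All channels of one frame share the same point in time.
        let frame = self.samples_emitted / u64::from(self.shape.channels);
        self.samples_emitted += 1;
        let t = frame as f32 / self.shape.sample_rate as f32;
        Some((2.0 * std::f32::consts::PI * self.freq_hz * t).sin())
    }
}
```

Because the shape is known up front, the source can generate samples directly at the target rate and channel count, so no converter is needed downstream.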
Complete control is still possible. In principle every source should try to adhere to whatever its parent/predecessor/sink/wrapping source (we really have to write down what we call this) decided, but a source can decide to pass on its own values. We can add a source that does this with a value given by the end user. That is what I meant here: #646 (comment). You could use it like this:
let sine = Signalgenerator(Sine);
let source = Decoder::new("some path.mp3").pausable().amplify().mix(sine).with_preferred_samplerate(44_100).with_fixed_channels(2);
stream.mix(source);
The sine wave sample_rate will be 44_100 even though the stream might have tried to negotiate a 96_000 sample_rate. It is 44_100 because that is what |
A more advanced design could present multiple sample rates during negotiation, in some order of preference. Maybe then the sources could negotiate with the playback hardware and find the optimal sample rate. You recently reworked the API for that; do you think that is possible? |
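A preference-ordered negotiation could be as small as the sketch below. The function name `pick_rate` and the idea of representing the hardware's capabilities as a flat list of rates are assumptions made for illustration; real devices usually report ranges.

```rust
/// Returns the first rate in `preferred` (ordered best first)
/// that also appears in `supported`, or `None` if there is no overlap.
fn pick_rate(preferred: &[u32], supported: &[u32]) -> Option<u32> {
    preferred.iter().copied().find(|rate| supported.contains(rate))
}
```

When `None` comes back, the caller would fall back to the device default and insert a resampler, which is the situation today.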
I'd prefer to have the sample rate as a construction parameter. In case a source needs frames/chunks with output formats varying over time, that can be plugged into Given your example, stream parameters can be passed as:
let output = OutputStreamBuilder::open_default_stream();
// Can be "shape" instead of "format", maybe. I like the word but frequency
// is not really a shape...
// Note that there is only one conversion step here.
let source = Decoder::new("file.mp3")
    .format(output.format()) // Inserts UniformSourceIterator
    // All subsequent steps can rely on their input format not changing.
    .pausable().amplify(0.5).mix(Signalgenerator(Sine));
output.mix(source);
A bit inconvenient, maybe, but the parameter setting on each step is automatic. Each step's constructor knows the format of its inputs, and all the linking can be checked at construction time. So when input and output formats do not match, an error can be returned immediately. It can be done in reverse, which may be more convenient but is more complex, especially for sinks that have a dynamic set of inputs. |
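The construction-time check described above might look roughly like this. `Format`, `FormatMismatch` and `Mix` are hypothetical types for this sketch, not existing rodio items.

```rust
/// Hypothetical stream format known at construction time.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Format {
    sample_rate: u32,
    channels: u16,
}

/// Error returned when two inputs cannot be linked without conversion.
#[derive(Debug, PartialEq, Eq)]
struct FormatMismatch {
    left: Format,
    right: Format,
}

/// Hypothetical mixing step; it only records the format it agreed on.
struct Mix {
    format: Format,
}

impl Mix {
    /// Fails immediately if the inputs disagree, instead of resampling later.
    fn new(left: Format, right: Format) -> Result<Self, FormatMismatch> {
        if left == right {
            Ok(Mix { format: left })
        } else {
            Err(FormatMismatch { left, right })
        }
    }
}
```

The caller can then decide explicitly where to put a converter, rather than having one inserted silently at every step.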
CPAL does give a set of possible audio formats, so in theory one can calculate the intersection between one's preferences and what is available, but I suspect it may be a bit complicated. |
I'd like to try; it would be the ultimate solution, right? We would minimize unneeded resampling and the user would not even notice (except for fewer artifacts and better performance). One hard thing might be decoders: they might change sample rate mid-song. The negotiation might need to happen multiple times to support that... |
@dvdsk Yes, to get the ultimate reduction in conversions one can do that from both ends: enumerate all the inputs, see what the output device can do, and then pick some stream format that satisfies them all, but it will not always be possible. As I see it, requiring inputs to convert to some common format and then just using that would be a meaningful improvement already. Alternatively, if the inputs are all of the same format, use that and add a converter only at the end of the chain, just before the output stream, and only if necessary. I believe all the sources that do change format are somehow external to rodio; they are its inputs or close to its inputs. That is why, as I mentioned earlier, I am for having also simpler |
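The "use the inputs' format if they all agree, otherwise convert" rule can be sketched as follows. The function name and the flat list of device rates are assumptions for illustration only.

```rust
/// If every input has the same rate and the device supports it, use that rate
/// so nothing needs resampling. Otherwise fall back to the device default and
/// let mismatching inputs be converted to it.
fn choose_stream_rate(input_rates: &[u32], device_rates: &[u32], device_default: u32) -> u32 {
    match input_rates.split_first() {
        Some((&first, rest))
            if rest.iter().all(|&rate| rate == first) && device_rates.contains(&first) =>
        {
            first
        }
        _ => device_default,
    }
}
```

This covers the common case of a single decoder playing one file, without attempting a full two-sided negotiation.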
CPAL format enumeration is cumbersome, so some sort of query function would be welcome (one where we can filter formats by some criteria or a predicate). Ideally it should live in CPAL, not in rodio. A crazy implementation would insert all formats into SQLite and query that... |
Hope you're okay with the rename. You and I have said some valuable stuff here and I want it to be easier to find. That's why I renamed it. |
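A predicate-based query might look like the sketch below. `ConfigRange` is a simplified, hypothetical stand-in loosely modelled on the kind of range CPAL reports (a channel count plus a minimum and maximum sample rate); it is not CPAL's actual type, and `query` and `supports` are invented names.

```rust
/// Simplified stand-in for one supported configuration range of a device.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct ConfigRange {
    channels: u16,
    min_rate: u32,
    max_rate: u32,
}

/// Returns every range accepted by the caller's predicate.
fn query(ranges: &[ConfigRange], pred: impl Fn(&ConfigRange) -> bool) -> Vec<ConfigRange> {
    ranges.iter().copied().filter(|range| pred(range)).collect()
}

/// Example predicate: does this range cover the wanted channel count and rate?
fn supports(range: &ConfigRange, channels: u16, rate: u32) -> bool {
    range.channels == channels && (range.min_rate..=range.max_rate).contains(&rate)
}
```

A caller would write something like `query(&ranges, |r| supports(r, 2, 48_000))` and take the first hit, or fall back to the default configuration if the result is empty.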
from the user stories thread
from hodaun readme, section differences to rodio:
That sounds like my proposal here: #646 (comment). and this:
sounds like what you proposed. |
Yes, I'd also like to pass preferred stream parameters to a source during construction. I do not like the "negotiation" part (if I understand it correctly), that is potentially changing parameters at some point later after construction is complete. I am afraid that may complicate the logic a lot. Sources that do need to change stream parameters dynamically can be adapted with a dedicated format converter that then produces consistent output. |
Regarding Hodaun, I would also like to process sample frames instead. At the least, it should be a strict requirement that a source always produces whole frames (it does not end after passing only some of the channels). |
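The whole-frames requirement is easy to state in code. This sketch assumes interleaved `f32` samples; the function name `frames` is made up for the example.

```rust
/// Splits interleaved samples into whole frames of `channels` samples each.
/// Returns `Err(n)` if the input ends with a partial frame of `n` samples.
fn frames(samples: &[f32], channels: usize) -> Result<Vec<&[f32]>, usize> {
    assert!(channels > 0, "a frame needs at least one channel");
    let chunks = samples.chunks_exact(channels);
    let leftover = chunks.remainder().len();
    if leftover == 0 {
        Ok(chunks.collect())
    } else {
        Err(leftover)
    }
}
```

A source that ends mid-frame would trip the `Err` branch, which is exactly the case the strict requirement is meant to rule out.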
It is complex, but it would allow us to eliminate resampling when the hardware can take care of it for us. I think the complexity is not that high, since rodio's sources form a simple tree. The negotiation does not need to be perfect; if it can cover the most common case (a song playing through a decoder) I think it might be worth it already. Not having to resample will save quite some battery on low-power devices. |
I'll work on it once we have a better resampler in rodio, and we can always decide not to merge. But I think we need to see how it would work out before deciding if this is something we want in rodio. I only have a vague plan right now and most of the complexity will only become clear once I start implementing it. |
@dvdsk The formats are "suggested", so to say, by both the sources and the output device. A source's format can change, but the stream has to be re-initialized to change that. In my opinion the approach where an explicit stream format is passed to the sources is the best (see #646 (comment)). Chasing a perfect score does not seem worth it, as dynamic changes will complicate the logic. It is not about topology but difficulties similar to handling I may change my opinion if I see the prototype, maybe I misunderstand something. |
As a side note it is possible to have a generic adapter that helps for any source to handle |
I do not really get how that would work, is it something you can easily demonstrate with example code? |
I did not know that; it makes sense though.
You are completely right here. Perfect is the enemy of good. I think if we do this, we should do it as a slight optimization on the current situation. Most sources have a constant sample rate; we just do not know the rate at compile time. So if we do a single negotiation at creation we can determine a good |
@dvdsk I just tried to write such an adapter for processing spans and it would take some time for me to see how to do this in a generic way. I'll try to have something that at least builds :) Although you can look at |
@dvdsk Yes, now I admit that in Rust such an adapter will be a lot more complicated than I thought, or maybe I do not understand Rust well enough yet. In a garbage-collected language this would be more straightforward. The filter that exists only for a span should be given a temporary source instance, since it is supposed to own it, and on the other side of it there should be some source that can be dynamically attached to another source (when the span changes). So I stopped halfway :) Maybe I will give it another try if I get a clearer idea. |
That is what prototyping is for; you cannot figure out everything beforehand. Hope you had some fun along the way. |
This should help to reduce number of sample rate conversions. Setting it to current de-facto default (44.1kHz).
There is still a question whether we should use cpal::SampleRate or plain integer. With either choice it is better to use the same type everywhere in the API. Currently some functions accept cpal::SampleRate while others use u32. cpal::SampleRate helps with type-checks but does not really provide useful services.
Also partially relieves #208. The rate converter has to be fixed still but with default sample rate less resampling is needed.
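To illustrate the trade-off between a wrapper type and a plain integer, here is a small sketch. The `SampleRate` and `ChannelCount` types below are local newtypes defined for this example; they are not the cpal definitions, and `samples_per_second` is a made-up helper.

```rust
/// Local newtype for a sample rate in Hz (illustrative, not cpal's type).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct SampleRate(u32);

/// Local newtype for a channel count.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct ChannelCount(u16);

/// With newtypes, swapping the two arguments is a compile error.
/// With plain `u32`/`u16` integers the compiler could not tell them apart
/// once literals or casts are involved.
fn samples_per_second(rate: SampleRate, channels: ChannelCount) -> u32 {
    rate.0 * u32::from(channels.0)
}
```

The cost is the `.0` unwrapping at every arithmetic site, which is the "does not really provide useful services" part; whichever choice is made, using it consistently across the API matters more than the choice itself.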