Reliably targeting some RPS, and in stages? #120
Comments
Hi @shibrady,
How do you come to this conclusion?
I haven't actually tried using these executors out, but as an example (that I think gets the point across):
If my goal is to output some If I understood my SUT well enough to e.g. define SLOs, then I think this might be ok - e.g. I could say that I expect p99 to be < 500ms, provision
We don't use thresholds at the moment and just aggregate metrics through another independent service. EDIT: Clarified some of my concerns after better understanding that the
Interesting. Neither k6 nor k6-operator is smart enough to find the best parameters for a given SUT on its own ("adaptively") and, at least at the moment, I'd say they shouldn't be expected to. I.e. yes, you'll need to try out different parameter values until you find what's most suitable for you. Your configuration makes sense, though it's not clear how much one pod can handle. If you know the max number of VUs possible per pod and the desired rate, then calculating both As for the SUT's latency not being known well enough: as a suggestion, perhaps it makes sense to try a more exploratory executor, e.g. ramping-arrival-rate, with one pod for simplicity (if possible), to understand how the given SUT behaves under different loads? And then use those observations to decide on the parameters that would fit the chosen testing and deployment strategies. TBH, this sounds to me like a general issue of how to find a fitting implementation of the k6 script for the given case. If you know what load one pod can reliably handle and what kind of load you want to achieve, it's straightforward to calculate the numbers with k6-operator. IMHO, it'd be really nice to have a system that could self-adapt and determine its own parameters, but that's quite a broad and complex problem which is currently outside of k6's or k6-operator's scope, AFAIK 😄
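For reference, here is a minimal sketch of what the exploratory setup suggested above might look like. The executor name and its fields (`startRate`, `timeUnit`, `preAllocatedVUs`, `maxVUs`, `stages`) are k6's ramping-arrival-rate options; the helper `buildRampingOptions`, the scenario name `explore_rps`, and all of the numbers are assumptions made up for illustration, not values taken from this thread.

```javascript
// Builds a k6 options object with a single ramping-arrival-rate scenario.
// buildRampingOptions and "explore_rps" are hypothetical names.
function buildRampingOptions(stages, preAllocatedVUs, maxVUs) {
  return {
    scenarios: {
      explore_rps: {
        executor: 'ramping-arrival-rate',
        startRate: 0,      // iterations per timeUnit at the start of the test
        timeUnit: '1s',    // so every target below is "iterations per second"
        preAllocatedVUs,   // VUs created before the test starts
        maxVUs,            // upper bound k6 may grow to if iterations slow down
        stages,            // each stage ramps to `target` over `duration`
      },
    },
  };
}

// Illustrative plan: ramp to 1000 iter/s, hold, ramp to 5000 iter/s, hold.
const options = buildRampingOptions(
  [
    { target: 1000, duration: '2m' },
    { target: 1000, duration: '5m' },
    { target: 5000, duration: '2m' },
    { target: 5000, duration: '5m' },
  ],
  500,
  2000
);
```

In an actual k6 script this object would be exported (`export const options = ...`) next to the default iteration function. Watching how many VUs k6 ends up using at each plateau should give a rough idea of per-pod capacity before choosing values for a distributed run.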
In theory, this kind of distributed feature can be easily achieved with the "leaky bucket"/"rate limiting" family of mechanisms. This could be managed either by a third party or directly inside the workers using gossip. Technically, the simplest implementation would be based on a really tiny redis or memcached instance dedicated to this purpose. This tiny instance would be filled by all parallel workers until the bucket is full ... waiting for the next leak to create a new spot for another VU. https://www.mikeperham.com/2020/11/09/the-leaky-bucket-rate-limiter/
Another way to solve this would be to have the operator handle such leaking/rate-limiting calls through its own API (but this API must be really resilient to load, at a QPS that redis or memcached could easily handle).
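To make the leaky bucket idea concrete, here is a rough single-process sketch. It is an in-memory stand-in, not the redis/memcached-backed or gossip-based variant described above, and the `LeakyBucket` class, its constructor parameters, and the injected clock are all hypothetical names chosen for illustration.

```javascript
// In-memory leaky bucket: each admitted unit of work raises the level by 1,
// and the level drains at a fixed rate. Work is refused while the bucket is full.
class LeakyBucket {
  constructor(capacity, leakPerSecond, now = () => Date.now()) {
    this.capacity = capacity;           // max units the bucket can hold
    this.leakPerSecond = leakPerSecond; // drain rate, i.e. the sustained rate
    this.now = now;                     // injectable clock (milliseconds)
    this.level = 0;
    this.lastLeakMs = now();
  }

  // Drain the bucket according to the time elapsed since the last call.
  leak() {
    const t = this.now();
    const leaked = ((t - this.lastLeakMs) / 1000) * this.leakPerSecond;
    this.level = Math.max(0, this.level - leaked);
    this.lastLeakMs = t;
  }

  // Returns true if there is room for one more unit, false otherwise.
  tryAcquire() {
    this.leak();
    if (this.level + 1 > this.capacity) return false;
    this.level += 1;
    return true;
  }
}
```

A shared version would keep `level` and `lastLeakMs` in the external store and update them atomically, so that all workers see the same bucket. That coordination is the hard part and is not shown here.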
That makes sense to me. I agree life would be simpler knowing more about the SUT in this case (and maybe that's the direction we go), but it's good to clear up what's possible in the context of k6!
Sorry for the delay! @shibrady I'm glad that our discussion cleared up the situation 🎉 From my side, it also gave me an indicator that it would probably be good to add some clarity in the docs about the different executors in the context of k6-operator test runs. @BarthV I'm not sure this fits the use case of finding the RPS, but it's definitely an interesting idea, thanks for sharing! Leaky bucket sounds very similar to what Another interesting question here is whether it makes sense to mess with the executor-level logic of k6... e.g. implement a separate ~ distributed
Closing this issue as the initial question appears to have been resolved. There is now a separate issue for documenting this topic:
Reading through grafana/k6#140 and seeing grafana/k6#2438 (although, not having grokked the changes) this might be a duplicate, but here goes...
I'd like to run distributed k6 instances to achieve some particular RPS (like arrival rate executors) given some iteration function (e.g. the default export). Ideally I'd like to be able to adjust this RPS target after certain durations (sort of like having multiple k6 Scenarios with different targets), and I'd like to be accurate (able to adapt to the SUT, like constant arrival rate executors).
It isn't clear to me how to reliably achieve this with k6 without knowing how many runners/VUs I will need ahead of time. We do currently run k6 on k8s distributed across several pods, but if I were to use this operator it seems like I would need to know how many runners I'd need ahead of time (parallelism), and how many VUs I should realistically be allocating per runner (although I'm admittedly not quite sure how --execution-segments work under the hood with constant arrival rate executors).
Say for example my SUT's request latency is a constant 100ms and I wanted to test at 100k QPS - I presume I'd need some ~10000 VUs, which I may not be able to reliably run on 1 k6 instance due to lack of resources (we have some somewhat memory intensive k6 extensions going on). So if I run 1000 VUs per pod, I imagine I'd be able to reach 100k QPS with 10 pods. But if my SUT's performance degrades, I imagine I'd need more VUs; I could imagine giving some headroom per pod with constant arrival rate executors (I'm assuming this works with --execution-segments), but realistically there's only so much headroom I can give per pod, and I'd just need more k6 runners.
(and realistically I may not know the expected latency ahead of time, and it may grow arbitrarily high mid-test)
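The sizing arithmetic in this example follows Little's law: concurrent VUs ≈ arrival rate × iteration duration. A small sketch of that calculation, assuming one request per iteration and a stable latency; the helper names and the `headroom` factor are hypothetical:

```javascript
// Little's law: VUs in flight ≈ iterations per second × seconds per iteration.
// `headroom` is a hypothetical safety multiplier (1.0 = none).
function requiredVUs(targetRps, latencyMs, headroom = 1.0) {
  // Multiply first and divide last to keep the arithmetic exact for round inputs.
  return Math.ceil((targetRps * latencyMs * headroom) / 1000);
}

// Number of pods needed if each pod can reliably run `vusPerPod` VUs.
function requiredPods(totalVUs, vusPerPod) {
  return Math.ceil(totalVUs / vusPerPod);
}

const vus = requiredVUs(100000, 100);   // 100k QPS at 100ms latency
const pods = requiredPods(vus, 1000);   // at 1000 VUs per pod
console.log(vus, pods);                 // prints: 10000 10
```

If latency rises to 250ms mid-test, the same formula gives 25000 VUs, i.e. 25 pods at 1000 VUs each. That is exactly the problem described here: the right runner count depends on a latency that isn't known up front.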
This is fine when just thinking of one-off load tests where I'm able to go through trial and error, but seems less than ideal when wanting to automate this more generally. For now we have some tooling on our end to try and handle our use case (essentially a controller goes through trial and error, monitoring actual runner RPS and adjusting runner counts to meet a target), but this has had its flaws and we're looking into alternatives.
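To illustrate the kind of trial-and-error controller meant here (this is a simplified, hypothetical sketch, not our actual tooling), one adjustment step could scale the runner count by the ratio of target to observed RPS:

```javascript
// One step of a hypothetical proportional controller.
// Assumes per-runner throughput stays roughly constant as runners are added,
// which may not hold when the SUT itself is the bottleneck.
function nextRunnerCount(currentRunners, targetRps, observedRps, maxRunners) {
  if (observedRps <= 0) {
    // No signal yet: grow cautiously by one runner.
    return Math.min(currentRunners + 1, maxRunners);
  }
  const perRunnerRps = observedRps / currentRunners;
  const needed = Math.ceil(targetRps / perRunnerRps);
  // Clamp to at least one runner and at most the allowed maximum.
  return Math.max(1, Math.min(needed, maxRunners));
}

// 10 runners delivering 80k of a 100k RPS target -> suggest 13 runners.
console.log(nextRunnerCount(10, 100000, 80000, 50)); // prints: 13
```

The constant-throughput assumption is one source of the flaws mentioned: if the SUT slows down as load increases, each added runner delivers less than the previous step predicted, and the loop can overshoot or oscillate.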
I'm wondering if there may be something I'm missing about k6's or the k6 operator's features that makes this easy, but if not, what the k6 team's thoughts are on this (or maybe grafana/k6#2438 is meant to address this?).