Running k6 with high number of VUs overloads InfluxDB #1060
Comments
This is not possible at the moment, but I'll leave this issue open, to serve as a feature request for metric aggregation in k6 for different outputs.
Unfortunately not... PRs welcome: https://github.com/loadimpact/k6/blob/2a2fe2cc665e0d2b818c4f3ca7ce4fc9a5821294/stats/influxdb/collector.go#L34-L36 As a workaround until we have aggregation in the InfluxDB output, you could probably use telegraf. It seems to be able to accept data via the InfluxDB API, so k6 should be able to send metrics to it directly. It also seems to support some aggregation: https://github.com/influxdata/telegraf#aggregator-plugins As a somewhat connected issue, #570 might interest you. Currently, k6 always emits all the metrics it measures, which, as you've seen, can be quite a lot of data. We want to add a way to filter out the metrics you're not interested in, so follow that issue for updates on the topic.
I'm struggling with this right now: even a smallish 120 VU test absolutely hammers InfluxDB, both in terms of load and data points (quickly hitting the dreaded […]). Is there some way to sub-sample the metrics on high-volume tests?
@benc-uk, sorry for the late response. Unfortunately, there's not a lot that can currently be done before we implement #1321 or generic metric aggregation. You can try tweaking the k6 pushing behavior by setting […]. As a workaround, you could try exporting the raw metrics to a gzipped CSV or JSON file and then sending that data to InfluxDB with a small script, at a more sedate pace, after the k6 script has finished.
Thanks for confirming. I found the "hidden" K6_INFLUXDB options, but like you say, they don't help that much. For our next project we anticipate running long and very high VU tests, and the volume of data k6 pushes into InfluxDB, or even CSV, without some pre-aggregation or sub-sampling is just going to be unmanageable. For now we're going to use the post-test summary data, and monitor the requests and other data points another way (from the backend).
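To make "pre-aggregation" concrete, here is a minimal sketch of the idea in Python. It is not part of k6 and not how any k6 output works; the `aggregate` function, the `(metric, unix_time, value)` sample shape, and the 10-second window are all assumptions chosen for illustration. It collapses raw samples into one summary row per metric per time window, which is roughly what an aggregating output or a Telegraf aggregator would do before writing to InfluxDB.

```python
from collections import defaultdict
from statistics import mean


def aggregate(samples, window_s=10):
    """Collapse raw (metric, unix_time, value) samples into one summary
    per metric per time window. Hypothetical helper, for illustration only."""
    buckets = defaultdict(list)
    for metric, ts, value in samples:
        # Align each sample to the start of its window.
        window_start = int(ts // window_s) * window_s
        buckets[(metric, window_start)].append(value)

    return [
        {
            "metric": metric,
            "window_start": start,
            "count": len(values),
            "min": min(values),
            "max": max(values),
            "mean": mean(values),
        }
        for (metric, start), values in sorted(buckets.items())
    ]
```

The trade-off is that per-request detail is lost: with only count, min, max and mean stored, percentiles can no longer be computed exactly from the stored data.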
I haven't done it, but it shouldn't be very hard. You can use the raw line protocol from pretty much any language, even […]
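A rough sketch of such a replay script, in Python, might look like the following. It assumes the test was run with the k6 JSON output into a gzipped file (one JSON object per line, with `"type": "Point"` entries carrying `metric`, `data.time`, `data.value` and `data.tags`) and that the target is an InfluxDB v1 `/write` endpoint. The function names, the batch size, the pause, and the example URL are all hypothetical; nothing here comes from k6 itself.

```python
import gzip
import json
import re
import time
import urllib.request
from datetime import datetime, timedelta, timezone

_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
_FRACTION = re.compile(r"\.(\d+)")


def escape_measurement(text):
    # Line protocol: measurement names need commas and spaces escaped.
    return str(text).replace(",", "\\,").replace(" ", "\\ ")


def escape_tag(text):
    # Tag keys and values need commas, equals signs and spaces escaped.
    return str(text).replace(",", "\\,").replace("=", "\\=").replace(" ", "\\ ")


def to_epoch_us(rfc3339):
    # Trim the fraction to microseconds and normalize "Z" so that
    # datetime.fromisoformat accepts the timestamp on older Pythons.
    text = rfc3339.replace("Z", "+00:00")
    text = _FRACTION.sub(lambda m: "." + m.group(1)[:6].ljust(6, "0"), text, count=1)
    return (datetime.fromisoformat(text) - _EPOCH) // timedelta(microseconds=1)


def point_to_line(entry):
    """Turn one k6 JSON 'Point' entry into a line-protocol line."""
    data = entry["data"]
    tags = data.get("tags") or {}
    # Empty tag values are not valid in line protocol, so skip them.
    tag_part = "".join(
        f",{escape_tag(k)}={escape_tag(v)}" for k, v in sorted(tags.items()) if v != ""
    )
    measurement = escape_measurement(entry["metric"])
    return f"{measurement}{tag_part} value={float(data['value'])} {to_epoch_us(data['time'])}"


def read_points(path):
    # Yield only the sample entries, skipping metric declarations.
    with gzip.open(path, "rt", encoding="utf-8") as fh:
        for raw in fh:
            entry = json.loads(raw)
            if entry.get("type") == "Point":
                yield entry


def batches(items, size):
    batch = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def replay(path, write_url, batch_size=5000, pause_s=1.0):
    """POST the file to InfluxDB in batches, sleeping between writes."""
    lines = (point_to_line(p) for p in read_points(path))
    for batch in batches(lines, batch_size):
        body = "\n".join(batch).encode("utf-8")
        req = urllib.request.Request(write_url, data=body, method="POST")
        urllib.request.urlopen(req).close()
        time.sleep(pause_s)
```

A call could look like `replay("results.json.gz", "http://localhost:8086/write?db=k6&precision=u")`, where the file name and database name are placeholders. `precision=u` matches the microsecond timestamps the sketch emits. Tuning `batch_size` and `pause_s` is what keeps the write rate at a pace InfluxDB can absorb.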
Now that k6 v0.42.0 supports Prometheus remote write as an experimental output, it can be used to mitigate this issue. InfluxDB v2 doesn't directly support it, but it is possible to replicate the same concept using the Prometheus remote write output with Telegraf.
As the previous comments clearly explain, fixing the main issue would require an extensive refactoring of the current output. I'm closing this issue as we don't plan any significant improvements to the InfluxDB v1 output. Most of our resources for outputs will probably be used to address #2557, which seems the best option for the big tent philosophy.
I have a problem running k6 with 10 0000 VUs. It runs fine without the output, but when I try to use InfluxDB to do some analytics, k6 generates slightly fewer RPS and writes so much data to InfluxDB that it overloads. Is it possible to configure the InfluxDB output so that it aggregates the data before sending it?
We're also using Yandex.Tank for simpler scenarios. It can generate an even bigger load, yet its logs are about 10 times smaller than k6's and it sends them less frequently. Can I somehow configure k6 to send data less frequently?