Add/non slo redis alerts #256
I get why this could be a reason to go with the specific metrics, but here's an idea: if we use the kube-state-metrics for memory and storage, as we do for PostgreSQL, we can put the generation of those rules into jsonnet functions and just parameterize them, like I did for the sloth rules. That way we avoid the risk of copy/paste errors, and we know all the metrics come from the same source. If we ever need to adjust them, we can do it in one place.
Ok, we could abstract that, but we still need to enable the additional metrics for both customers and us, especially if we get to cluster monitoring. I'm flexible with both solutions; @TheBigLee, be the tiebreaker :D Those are the rules I was talking about. There are a few pages that aggregate Prometheus rules, and they all reuse those additional metrics, which is why I think it's worth enabling them.
Sure, I'm not against enabling the additional metrics; they will be useful for additional alerts and dashboards. I'm just saying we should use the kube-state-metrics for storage and memory, as they are service-agnostic.
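The parameterization idea discussed above can be sketched as follows. This is only an illustration written in Python rather than jsonnet, and it is not the code from this PR: the alert names, the `vshn-redis-.*` namespace pattern, the threshold, and the choice of metrics are all assumptions made for the example.

```python
from typing import Dict, List


def _with_namespace(metric: str, namespace_regex: str) -> str:
    """Append a namespace matcher to a metric, keeping any existing selector."""
    matcher = f'namespace=~"{namespace_regex}"'
    if metric.endswith("}"):
        return f"{metric[:-1]},{matcher}}}"
    return f"{metric}{{{matcher}}}"


def usage_rule(service: str, resource: str, used: str, total: str,
               namespace_regex: str, threshold: float = 0.9) -> Dict:
    """Build one Prometheus alerting rule (as a dict) for a used/total ratio."""
    expr = (
        f"sum by (namespace) ({_with_namespace(used, namespace_regex)})"
        f" / sum by (namespace) ({_with_namespace(total, namespace_regex)})"
        f" > {threshold}"
    )
    return {
        "alert": f"{service}{resource}Critical",  # hypothetical naming scheme
        "expr": expr,
        "for": "15m",
        "labels": {"severity": "critical"},
    }


def service_rules(service: str, namespace_regex: str) -> List[Dict]:
    """Generate the service-agnostic memory and storage rules for one service."""
    # The metric names below are assumptions chosen for illustration; they are
    # not taken from this PR.
    return [
        usage_rule(service, "Memory",
                   'container_memory_working_set_bytes{container!=""}',
                   'kube_pod_container_resource_limits{resource="memory"}',
                   namespace_regex),
        usage_rule(service, "Storage",
                   "kubelet_volume_stats_used_bytes",
                   "kubelet_volume_stats_capacity_bytes",
                   namespace_regex),
    ]


redis_rules = service_rules("Redis", "vshn-redis-.*")
postgres_rules = service_rules("PostgreSQL", "vshn-postgresql-.*")
```

With a generator like this, the Redis and PostgreSQL rules differ only in the arguments passed in, so an adjustment to an expression happens in one place.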
component/tests/golden/vshn/appcat/appcat/21_composition_vshn_postgresrestore.yaml
f218b9b to 51c7b6d
The rendered composition looks a bit broken. Also, before merging this, test it on the lab and check whether the existing dashboards on insights need adjustments.
78075db to 70b68d2
7d03326 to c566519
c566519 to 26a4794
Some things look a bit broken in the rendered YAML files.
Also, please test it on the lab to see whether anything in the dashboards needs fixing as well.
component/tests/golden/vshn/appcat/appcat/21_composition_vshn_postgres.yaml
As Simon already pointed out, there are changes in the Postgres files which should not be there. This PR should not change anything in Postgres, at most only the names.
Changes are requested
30d4af9 to b725c98
This has been fixed now.
Signed-off-by: Nicolas Bigler <nicolas.bigler@vshn.ch>
b725c98 to 5adf8b8
What do you mean? The dashboards are not affected by this, as they use the metrics and not the alertRules, and nothing changed with the metrics themselves.
I've refactored the Prometheus code a bit and put it into a separate file and also added the missing runbook for the
1930004 to f40f821
Signed-off-by: Nicolas Bigler <nicolas.bigler@vshn.ch>
f40f821 to 11fcebe
LGTM, I haven't tested it.
Enabling Redis system metrics, because the exporter does not include them by default:
REDIS_EXPORTER_INCL_SYSTEM_METRICS: Whether to include system metrics like total_system_memory_bytes, defaults to false.
The Redis storage alert was simply copied from the PostgreSQL storage alert.
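As a rough illustration of the setting quoted above, the sketch below builds the environment entry that would switch the exporter's system metrics on, in the usual Kubernetes `name`/`value` shape. The helper name `exporter_env` is hypothetical, and how the entry actually reaches the exporter container in this component is not shown in the PR description, so treat the wiring as an assumption.

```python
from typing import Dict, List


def exporter_env(include_system_metrics: bool = True) -> List[Dict[str, str]]:
    """Return container env entries for the Redis exporter.

    REDIS_EXPORTER_INCL_SYSTEM_METRICS defaults to false in the exporter,
    so it has to be set explicitly to expose system metrics such as
    total_system_memory_bytes.
    """
    return [
        {
            "name": "REDIS_EXPORTER_INCL_SYSTEM_METRICS",
            # Environment values are strings, hence "true"/"false".
            "value": "true" if include_system_metrics else "false",
        }
    ]


env = exporter_env()
```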