Commit

Update upstream alerts
github-actions[bot] committed Aug 21, 2024
1 parent 208caa0 commit 8979b08
Showing 10 changed files with 156 additions and 64 deletions.
28 changes: 0 additions & 28 deletions component/extracted_alerts/master/collector_prometheus_alerts.yaml
@@ -17,34 +17,6 @@ spec:
labels:
service: collector
severity: critical
- alert: CollectorHighErrorRate
annotations:
message: "{{ $value }}% of records have resulted in an error by {{ $labels.namespace }}/{{ $labels.pod }} collector component."
summary: "{{ $labels.namespace }}/{{ $labels.pod }} collector component errors are high"
expr: |
100 * (
collector:log_num_errors:sum_rate{app_kubernetes_io_part_of = "cluster-logging"}
/
collector:received_events:sum_rate{app_kubernetes_io_part_of = "cluster-logging"}
) > 0.001
for: 15m
labels:
service: collector
severity: critical
- alert: CollectorVeryHighErrorRate
annotations:
message: "{{ $value }}% of records have resulted in an error by {{ $labels.namespace }}/{{ $labels.pod }} collector component."
summary: "{{ $labels.namespace }}/{{ $labels.pod }} collector component errors are very high"
expr: |
100 * (
collector:log_num_errors:sum_rate{app_kubernetes_io_part_of = "cluster-logging"}
/
collector:received_events:sum_rate{app_kubernetes_io_part_of = "cluster-logging"}
) > 0.05
for: 15m
labels:
service: collector
severity: critical
- alert: ElasticsearchDeprecation
annotations:
message: "In Red Hat OpenShift Logging Operator 6.0, support for the Red Hat Elasticsearch Operator has been removed. Bug fixes and support are provided only through the end of the 5.9 lifecycle. As an alternative to the Elasticsearch Operator, you can use the Loki Operator instead."
18 changes: 18 additions & 0 deletions component/extracted_alerts/master/lokistack_prometheus_alerts.yaml
@@ -175,6 +175,24 @@ groups:
for: 15m
labels:
severity: warning
- alert: LokiDiscardedSamplesWarning
annotations:
message: |-
Loki in namespace {{ $labels.namespace }} is discarding samples in the "{{ $labels.tenant }}" tenant during ingestion.
Samples are discarded because of "{{ $labels.reason }}" at a rate of {{ .Value | humanize }} samples per second.
summary: Loki is discarding samples during ingestion because they fail validation.
runbook_url: "[[ .RunbookURL]]#Loki-Discarded-Samples-Warning"
expr: |
sum by(namespace, tenant, reason) (
irate(loki_discarded_samples_total{
reason!="rate_limited",
reason!="per_stream_rate_limit",
reason!="stream_limit"}[2m])
)
> 0
for: 15m
labels:
severity: warning
- alert: LokistackSchemaUpgradesRequired
annotations:
message: |-
@@ -175,3 +175,21 @@ groups:
for: 15m
labels:
severity: warning
- alert: LokiDiscardedSamplesWarning
annotations:
message: |-
Loki in namespace {{ $labels.namespace }} is discarding samples in the "{{ $labels.tenant }}" tenant during ingestion.
Samples are discarded because of "{{ $labels.reason }}" at a rate of {{ .Value | humanize }} samples per second.
summary: Loki is discarding samples during ingestion because they fail validation.
runbook_url: "[[ .RunbookURL]]#Loki-Discarded-Samples-Warning"
expr: |
sum by(namespace, tenant, reason) (
irate(loki_discarded_samples_total{
reason!="rate_limited",
reason!="per_stream_rate_limit",
reason!="stream_limit"}[2m])
)
> 0
for: 15m
labels:
severity: warning
@@ -175,3 +175,21 @@ groups:
for: 15m
labels:
severity: warning
- alert: LokiDiscardedSamplesWarning
annotations:
message: |-
Loki in namespace {{ $labels.namespace }} is discarding samples in the "{{ $labels.tenant }}" tenant during ingestion.
Samples are discarded because of "{{ $labels.reason }}" at a rate of {{ .Value | humanize }} samples per second.
summary: Loki is discarding samples during ingestion because they fail validation.
runbook_url: "[[ .RunbookURL]]#Loki-Discarded-Samples-Warning"
expr: |
sum by(namespace, tenant, reason) (
irate(loki_discarded_samples_total{
reason!="rate_limited",
reason!="per_stream_rate_limit",
reason!="stream_limit"}[2m])
)
> 0
for: 15m
labels:
severity: warning
@@ -175,6 +175,24 @@ groups:
for: 15m
labels:
severity: warning
- alert: LokiDiscardedSamplesWarning
annotations:
message: |-
Loki in namespace {{ $labels.namespace }} is discarding samples in the "{{ $labels.tenant }}" tenant during ingestion.
Samples are discarded because of "{{ $labels.reason }}" at a rate of {{ .Value | humanize }} samples per second.
summary: Loki is discarding samples during ingestion because they fail validation.
runbook_url: "[[ .RunbookURL]]#Loki-Discarded-Samples-Warning"
expr: |
sum by(namespace, tenant, reason) (
irate(loki_discarded_samples_total{
reason!="rate_limited",
reason!="per_stream_rate_limit",
reason!="stream_limit"}[2m])
)
> 0
for: 15m
labels:
severity: warning
- alert: LokistackSchemaUpgradesRequired
annotations:
message: |-
@@ -204,6 +204,27 @@ spec:
severity: warning
syn: 'true'
syn_component: openshift4-logging
- alert: SYN_LokiDiscardedSamplesWarning
annotations:
message: |-
Loki in namespace {{ $labels.namespace }} is discarding samples in the "{{ $labels.tenant }}" tenant during ingestion.
Samples are discarded because of "{{ $labels.reason }}" at a rate of {{ .Value | humanize }} samples per second.
runbook_url: '[[ .RunbookURL]]#Loki-Discarded-Samples-Warning'
summary: Loki is discarding samples during ingestion because they fail
validation.
expr: |
sum by(namespace, tenant, reason) (
irate(loki_discarded_samples_total{
reason!="rate_limited",
reason!="per_stream_rate_limit",
reason!="stream_limit"}[2m])
)
> 0
for: 15m
labels:
severity: warning
syn: 'true'
syn_component: openshift4-logging
- alert: SYN_LokistackSchemaUpgradesRequired
annotations:
message: |-
@@ -204,6 +204,27 @@ spec:
severity: warning
syn: 'true'
syn_component: openshift4-logging
- alert: SYN_LokiDiscardedSamplesWarning
annotations:
message: |-
Loki in namespace {{ $labels.namespace }} is discarding samples in the "{{ $labels.tenant }}" tenant during ingestion.
Samples are discarded because of "{{ $labels.reason }}" at a rate of {{ .Value | humanize }} samples per second.
runbook_url: '[[ .RunbookURL]]#Loki-Discarded-Samples-Warning'
summary: Loki is discarding samples during ingestion because they fail
validation.
expr: |
sum by(namespace, tenant, reason) (
irate(loki_discarded_samples_total{
reason!="rate_limited",
reason!="per_stream_rate_limit",
reason!="stream_limit"}[2m])
)
> 0
for: 15m
labels:
severity: warning
syn: 'true'
syn_component: openshift4-logging
- alert: SYN_LokistackSchemaUpgradesRequired
annotations:
message: |-
@@ -23,42 +23,6 @@ spec:
severity: critical
syn: 'true'
syn_component: openshift4-logging
- alert: SYN_CollectorHighErrorRate
annotations:
message: '{{ $value }}% of records have resulted in an error by {{ $labels.namespace
}}/{{ $labels.pod }} collector component.'
summary: '{{ $labels.namespace }}/{{ $labels.pod }} collector component
errors are high'
expr: |
100 * (
collector:log_num_errors:sum_rate{app_kubernetes_io_part_of = "cluster-logging"}
/
collector:received_events:sum_rate{app_kubernetes_io_part_of = "cluster-logging"}
) > 0.001
for: 15m
labels:
service: collector
severity: critical
syn: 'true'
syn_component: openshift4-logging
- alert: SYN_CollectorVeryHighErrorRate
annotations:
message: '{{ $value }}% of records have resulted in an error by {{ $labels.namespace
}}/{{ $labels.pod }} collector component.'
summary: '{{ $labels.namespace }}/{{ $labels.pod }} collector component
errors are very high'
expr: |
100 * (
collector:log_num_errors:sum_rate{app_kubernetes_io_part_of = "cluster-logging"}
/
collector:received_events:sum_rate{app_kubernetes_io_part_of = "cluster-logging"}
) > 0.05
for: 15m
labels:
service: collector
severity: critical
syn: 'true'
syn_component: openshift4-logging
- alert: SYN_ElasticsearchDeprecation
annotations:
message: In Red Hat OpenShift Logging Operator 6.0, support for the Red
@@ -204,6 +204,27 @@ spec:
severity: warning
syn: 'true'
syn_component: openshift4-logging
- alert: SYN_LokiDiscardedSamplesWarning
annotations:
message: |-
Loki in namespace {{ $labels.namespace }} is discarding samples in the "{{ $labels.tenant }}" tenant during ingestion.
Samples are discarded because of "{{ $labels.reason }}" at a rate of {{ .Value | humanize }} samples per second.
runbook_url: '[[ .RunbookURL]]#Loki-Discarded-Samples-Warning'
summary: Loki is discarding samples during ingestion because they fail
validation.
expr: |
sum by(namespace, tenant, reason) (
irate(loki_discarded_samples_total{
reason!="rate_limited",
reason!="per_stream_rate_limit",
reason!="stream_limit"}[2m])
)
> 0
for: 15m
labels:
severity: warning
syn: 'true'
syn_component: openshift4-logging
- alert: SYN_LokistackSchemaUpgradesRequired
annotations:
message: |-
@@ -204,6 +204,27 @@ spec:
severity: warning
syn: 'true'
syn_component: openshift4-logging
- alert: SYN_LokiDiscardedSamplesWarning
annotations:
message: |-
Loki in namespace {{ $labels.namespace }} is discarding samples in the "{{ $labels.tenant }}" tenant during ingestion.
Samples are discarded because of "{{ $labels.reason }}" at a rate of {{ .Value | humanize }} samples per second.
runbook_url: '[[ .RunbookURL]]#Loki-Discarded-Samples-Warning'
summary: Loki is discarding samples during ingestion because they fail
validation.
expr: |
sum by(namespace, tenant, reason) (
irate(loki_discarded_samples_total{
reason!="rate_limited",
reason!="per_stream_rate_limit",
reason!="stream_limit"}[2m])
)
> 0
for: 15m
labels:
severity: warning
syn: 'true'
syn_component: openshift4-logging
- alert: SYN_LokistackSchemaUpgradesRequired
annotations:
message: |-
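
Note on the LokiDiscardedSamplesWarning expression added in this commit: the rule drops the three rate-limit reasons, takes an instant rate over a 2m window, sums it per (namespace, tenant, reason), and keeps any group above zero. The Python below is only a rough sketch of that arithmetic for readers unfamiliar with PromQL. It is not how Prometheus evaluates rules, the function names and the input shape are made up for illustration, and it does not model the `for: 15m` hold period.

```python
from collections import defaultdict

# Reasons filtered out by the reason!="..." matchers in the alert expression.
EXCLUDED_REASONS = {"rate_limited", "per_stream_rate_limit", "stream_limit"}


def irate(samples):
    """Per-second rate from the last two (timestamp_seconds, value) samples.

    Returns None when fewer than two samples fall in the window. A drop in
    the counter is treated as a reset, so the last value is the increase.
    """
    if len(samples) < 2:
        return None
    (t0, v0), (t1, v1) = samples[-2], samples[-1]
    if t1 <= t0:
        return None
    increase = v1 - v0 if v1 >= v0 else v1
    return increase / (t1 - t0)


def discarding_groups(series):
    """Sum instant rates by (namespace, tenant, reason); keep sums > 0.

    `series` is a hypothetical list of (labels_dict, samples) pairs, where
    samples are the points of one counter series inside the 2m window.
    """
    totals = defaultdict(float)
    for labels, samples in series:
        if labels.get("reason") in EXCLUDED_REASONS:
            continue
        rate = irate(samples)
        if rate is None:
            continue
        key = (
            labels.get("namespace", ""),
            labels.get("tenant", ""),
            labels.get("reason", ""),
        )
        totals[key] += rate
    return {key: total for key, total in totals.items() if total > 0}
```

Each key in the returned mapping corresponds to one alert instance that would enter the pending state; the alert fires only if the condition keeps holding for the full 15 minutes.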