This repository has been archived by the owner on Dec 12, 2022. It is now read-only.

updating service alert with higher wait time (5m) (#25)
Luca Venturelli authored Sep 16, 2019
1 parent 783395d commit 8f1c32d
Showing 1 changed file with 4 additions and 5 deletions.
9 changes: 4 additions & 5 deletions monitoring/templates/alert.rules
@@ -2,15 +2,15 @@ groups:
 - name: basic
   rules:

-  # Alert for any instance that is unreachable for >2 minutes.
+  # Alert for any instance that is unreachable for >5 minutes.
   - alert: service_down
-    expr: 100 * (count(up == 0) BY (job) / count(up) BY (job)) > 10
-    for: 2m
+    expr: 100 * (count(up == 0) BY (job) / count(up) BY (job)) > 5
+    for: 5m
     labels:
       severity: critical
     annotations:
       summary: "Instance {{ $labels.instance }} down"
-      description: "{{ $labels.instance }} of job {{ $labels.job }} has been down for more than 2 minutes."
+      description: "{{ $labels.instance }} of job {{ $labels.job }} has been down for more than 5 minutes."

   - alert: high_load
     expr: node_load1 > 2
@@ -66,4 +66,3 @@ groups:
${elasticsearch_rules}
${elasticsearch_additional_rules}
${custom_alert_rules}
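
Note on the service_down expression: besides raising the wait time from 2m to 5m, the diff also lowers the threshold from 10 to 5. The expression computes, per job, the percentage of targets whose `up` value is 0. The alert then fires only after that percentage has stayed above the threshold for the whole `for:` duration. A minimal Python sketch of the per-job arithmetic follows. The function names and sample data are hypothetical, for illustration only, and are not part of this repository.

```python
from collections import defaultdict


def down_percentage_by_job(samples):
    """Model of: 100 * (count(up == 0) BY (job) / count(up) BY (job)).

    samples: iterable of (job, instance, up) tuples, with up being 0 or 1.
    (Hypothetical input shape, chosen for this sketch.)
    """
    total = defaultdict(int)
    down = defaultdict(int)
    for job, _instance, up in samples:
        total[job] += 1
        if up == 0:
            down[job] += 1
    # As with the PromQL division, a job with no down targets produces
    # no result at all, because count(up == 0) has no series for it.
    return {job: 100 * down[job] / total[job] for job in down}


def breaching_jobs(samples, threshold=5):
    """Jobs whose down percentage exceeds the threshold (5 after this commit)."""
    pct = down_percentage_by_job(samples)
    return sorted(job for job, value in pct.items() if value > threshold)


# Hypothetical scrape results: one of four "api" targets is down.
samples = [
    ("api", "api-1", 1), ("api", "api-2", 0),
    ("api", "api-3", 1), ("api", "api-4", 1),
    ("db", "db-1", 1), ("db", "db-2", 1),
]
print(down_percentage_by_job(samples))  # {'api': 25.0}
print(breaching_jobs(samples))          # ['api']
```

Because the result is keyed by job only, the `instance` label is most likely not available to the annotations, so `{{ $labels.instance }}` would probably render as empty in the summary and description.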
