Dealing with common alert names from multiple Prometheus instances in "alerts" #7680
Replies: 1 comment
-
Hey @anuragbhatia! I'm here to help you with any bugs, questions, or contributions you have. Let's tackle this issue together! I found two similar open issues that might be relevant to your problem.
To deal with duplication of common local rules in a distributed Prometheus setup with Thanos, where multiple alerts with the same name appear in the Thanos frontend "alerts" section, you can rely on the deduplication built into the Thanos Querier's rules/alerts API. The deduplication logic is roughly as follows: the configured replica labels are stripped from each rule or alert, and entries that then have the same name and label set are collapsed into a single one.
Here is an example of how this deduplication logic is exercised in the Thanos test suite. The package name and import paths below are sketched for readability and may not match the exact layout of your Thanos version:

```go
package rules

import (
	"testing"

	"github.com/thanos-io/thanos/pkg/rules/rulespb"
	"github.com/thanos-io/thanos/pkg/testutil"
)

func TestDedupRules(t *testing.T) {
	for _, tc := range []struct {
		name          string
		rules, want   []*rulespb.Rule
		replicaLabels []string
	}{
		// Test cases here...
	} {
		t.Run(tc.name, func(t *testing.T) {
			// DedupRules strips the given replica labels and collapses rules
			// that then compare equal; `want` holds the deduplicated result.
			got := DedupRules(tc.rules, tc.replicaLabels)
			testutil.Equals(t, tc.want, got)
		})
	}
}
```

Additionally, the Thanos Querier can be configured with replica labels (the repeatable `--query.replica-label` flag) so that labels which only distinguish Prometheus replicas are ignored when deduplicating rules and alerts.
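For context, here is a minimal sketch of how this is typically wired up. The label name `replica`, the instance names, and the endpoints are illustrative assumptions, not values from this discussion: each Prometheus in a failure domain evaluates the same local rule files but carries a unique external label, and the Thanos Querier is told to treat that label as a replica label.

```yaml
# prometheus.yml (sketch) — identical rule files on every replica,
# but a unique "replica" external label per instance.
global:
  external_labels:
    cluster: eu-west        # shared across the failure domain
    replica: prometheus-01  # unique per Prometheus instance
```

```bash
# Thanos Querier (sketch): ignore the "replica" label when deduplicating,
# so rules/alerts that differ only in that label show up once in the UI.
thanos query \
  --query.replica-label=replica \
  --endpoint=sidecar-01:10901 \
  --endpoint=sidecar-02:10901
```

With this setup each Prometheus still fires its own copy of the alert, so alerting has no single point of failure, while the Querier's "alerts" view collapses the copies into one entry.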
-
I have a distributed setup with a bunch of Prometheus instances running in each failure domain (as often suggested), with Thanos sidecar + object storage offload. In this setup, alerting via the Thanos Ruler seems simple and quick, but it makes the Ruler a single point of failure. If I instead put local rules in each Prometheus (say, "alert if storage is at 80%"), alerting works without a single point of failure, but I see multiple alerts with the same name in the Thanos frontend "alerts" section.
Wondering how you all deal with this duplication of common local rules?