Add datadog.cluster_agent.leader_election.is_leader #19308

Open: keisku wants to merge 3 commits into master from keisku/cluster-agent-metrics

Contributor

@keisku keisku commented Dec 25, 2024

What does this PR do?

The Datadog Cluster Agent exposes leader_election_is_leader. This PR collects it as datadog.cluster_agent.leader_election.is_leader.

https://github.com/DataDog/datadog-agent/blob/c7f4f4e228f546dfb60fdceb6cb6ae92a14bca88/pkg/util/kubernetes/apiserver/leaderelection/metrics/metrics.go#L24-L32

Motivation

This metric is useful for troubleshooting leader election issues.

Review checklist (to be filled by reviewers)

  • Feature or bugfix MUST have appropriate tests (unit, integration, e2e)
  • Add the qa/skip-qa label if the PR doesn't need to be tested during QA.
  • If you need to backport this PR to another branch, you can add the backport/<branch-name> label to the PR and it will automatically open a backport PR once this one is merged


codecov bot commented Dec 25, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 87.63%. Comparing base (642b2f9) to head (c2c5bcd).
Report is 8 commits behind head on master.

Additional details and impacted files
Flag Coverage Δ
activemq ?
cassandra ?
datadog_cluster_agent 90.19% <ø> (ø)
hive ?
hivemq ?
hudi ?
ignite ?
jboss_wildfly ?
kafka ?
presto ?
solr ?

Flags with carried forward coverage won't be shown.

@keisku keisku force-pushed the keisku/cluster-agent-metrics branch from 7a42a1d to fc74e60 Compare December 25, 2024 09:27
Contributor

@clamoriniere clamoriniere left a comment

I added a comment suggesting another approach that would make an "is leader" metric easier to use.

Also it would be nice to add the metric in the OOTB Dashboard too.

@@ -68,6 +68,7 @@
     'kubernetes_apiserver_kube_events': 'kubernetes_apiserver.kube_events',
     'language_detection_dca_handler_processed_requests': 'language_detection_dca_handler.processed_requests',
     'language_detection_patcher_patches': 'language_detection_patcher.patches',
+    'leader_election_is_leader': 'leader_election.is_leader',
Contributor

This metric wasn't designed to be exposed directly.
Instead, it was added to be used as a "label join", see:

'label_joins': {
    'leader_election_is_leader': {
        'labels_to_match': ['*'],
        'labels_to_get': ['is_leader'],
    }
},
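
As a rough illustration of what a label join like the one above does, here is a minimal, self-contained sketch. It is a simplified model and not the integration's actual code: the helper names `labels_from_join_metric` and `apply_wildcard_join`, the `handler` label, and the rule that only series with value 1 contribute labels are all assumptions made for this example. Whether the real join skips series whose value is not 1 is a question raised later in this thread.

```python
from typing import Dict, Iterable, List, Tuple

Labels = Dict[str, str]


def labels_from_join_metric(
    samples: Iterable[Tuple[Labels, float]], labels_to_get: List[str]
) -> Labels:
    """Collect the labels named in labels_to_get from the join metric's samples."""
    extra: Labels = {}
    for labels, value in samples:
        # Assumption for this sketch: only series whose value is 1 contribute.
        if value != 1:
            continue
        for name in labels_to_get:
            if name in labels:
                extra[name] = labels[name]
    return extra


def apply_wildcard_join(metric_labels: Labels, extra: Labels) -> Labels:
    """Model labels_to_match == ['*']: attach the collected labels to any metric."""
    merged = dict(metric_labels)
    merged.update(extra)
    return merged


# The join metric carries its information in the label; its value is always 1.
join_samples = [({"is_leader": "true"}, 1.0)]
extra = labels_from_join_metric(join_samples, ["is_leader"])

# "handler" is a hypothetical label on some other metric from the same endpoint.
print(apply_wildcard_join({"handler": "events"}, extra))
```

Under these assumptions, every other metric ends up tagged with `is_leader`, which is why the series itself was never mapped to a metric name.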

IMO, if we want a metric that exposes the leader status, we should use the metric value (1 if leader, 0 if follower) instead of using a label.
Maybe it can be done in this check with a metric transformer function that converts the is_leader label's value to a metric value.
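
The conversion described above could look roughly like this. It is a minimal, self-contained sketch and does not use the real transformer API of the OpenMetrics base check: the `Sample` type, the function name `is_leader_to_gauge`, the `pod` label, and the `"true"`/`"false"` label values are assumptions made for illustration.

```python
from typing import Dict, List, NamedTuple, Tuple


class Sample(NamedTuple):
    """Hypothetical stand-in for one scraped sample of the join metric."""

    name: str
    labels: Dict[str, str]
    value: float


def is_leader_to_gauge(sample: Sample) -> Tuple[float, List[str]]:
    """Return (value, tags) where the value is 1.0 for the leader, 0.0 otherwise.

    The is_leader label is dropped from the tags because its information
    now lives in the metric value.
    """
    # Assumption: the label value is the string "true" on the leader.
    is_leader = sample.labels.get("is_leader", "false").lower() == "true"
    tags = [
        f"{key}:{val}"
        for key, val in sorted(sample.labels.items())
        if key != "is_leader"
    ]
    return (1.0 if is_leader else 0.0), tags


leader = Sample("leader_election_is_leader", {"is_leader": "true"}, 1.0)
follower = Sample(
    "leader_election_is_leader", {"is_leader": "false", "pod": "dca-1"}, 1.0
)

print(is_leader_to_gauge(leader))    # (1.0, [])
print(is_leader_to_gauge(follower))  # (0.0, ['pod:dca-1'])
```

In the actual check, a function like this would submit the result as a gauge rather than return it.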

You can find some transformer examples in other OpenMetrics-based checks, like

Contributor Author

@keisku keisku Dec 27, 2024

IMO, if we want a metric that exposes the leader status, we should use the metric value (1 if leader, 0 if follower) instead of using a label.

What do you think about achieving that on the Cluster Agent side? DataDog/datadog-agent#32511

We could keep the Datadog Cluster Agent check (Python) simple with this approach.

Contributor Author

Also it would be nice to add the metric in the OOTB Dashboard too.

Will do!

Contributor

I don't remember whether the join works if the value is not 1; there might be special handling of these series (always 1 with only tags).
