[EBPF-592] gpu: collect NVML metrics in agent check #30485

Merged: 6 commits merged into main from guillermo.julian/nvml-metrics-agentcheck on Nov 4, 2024

gjulianm (Contributor) commented Oct 25, 2024

What does this PR do?

This PR integrates the NVML metrics collectors in the agent check.
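
As a rough illustration only (not the code from this PR), the sketch below shows one way a check can run several metric collectors and merge their output. The `Collector` interface, the `Metric` type, `staticCollector`, the metric name, and the tag are all hypothetical; a real collector would query NVML (for example through `github.com/NVIDIA/go-nvml`) instead of returning fixed values.

```go
package main

import (
	"errors"
	"fmt"
)

// Metric is a hypothetical name/value sample with tags (e.g. the device UUID).
type Metric struct {
	Name  string
	Value float64
	Tags  []string
}

// Collector is a hypothetical interface; the PR's actual collector API may differ.
type Collector interface {
	Name() string
	Collect() ([]Metric, error)
}

// errUnavailable simulates a device query that cannot be served.
var errUnavailable = errors.New("device unavailable")

// staticCollector stands in for an NVML-backed collector and returns fixed data.
type staticCollector struct {
	name    string
	metrics []Metric
	err     error
}

func (c staticCollector) Name() string               { return c.name }
func (c staticCollector) Collect() ([]Metric, error) { return c.metrics, c.err }

// runCollectors gathers metrics from every collector. A failing collector is
// recorded and skipped so that it does not block the remaining ones.
func runCollectors(collectors []Collector) ([]Metric, error) {
	var all []Metric
	var errs []error
	for _, c := range collectors {
		ms, err := c.Collect()
		if err != nil {
			errs = append(errs, fmt.Errorf("collector %s: %w", c.Name(), err))
			continue
		}
		all = append(all, ms...)
	}
	// errors.Join returns nil when no errors were recorded.
	return all, errors.Join(errs...)
}

func main() {
	collectors := []Collector{
		staticCollector{name: "utilization", metrics: []Metric{
			{Name: "gpu.utilization", Value: 42, Tags: []string{"gpu_uuid:GPU-0"}}, // hypothetical metric and tag
		}},
		staticCollector{name: "memory", err: errUnavailable},
	}
	metrics, err := runCollectors(collectors)
	for _, m := range metrics {
		fmt.Printf("%s=%v %v\n", m.Name, m.Value, m.Tags)
	}
	if err != nil {
		fmt.Println("partial failure:", err)
	}
}
```

Skipping a failed collector instead of aborting the whole run is a design choice made for this sketch; it keeps the remaining metrics flowing when a single device query fails.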

Motivation

Describe how to test/QA your changes

Included in e2e tests

Possible Drawbacks / Trade-offs

Additional Notes

@gjulianm self-assigned this Oct 25, 2024
@gjulianm added the changelog/no-changelog and qa/done (QA done before merge and regressions are covered by tests) labels Oct 25, 2024

cit-pr-commenter bot commented Oct 25, 2024

Go Package Import Differences

Baseline: 0a57140
Comparison: 57b0ddf

binary         os     arch   change
agent          linux  amd64  +3, -0
    +github.com/DataDog/datadog-agent/pkg/collector/corechecks/gpu/nvidia
    +github.com/NVIDIA/go-nvml/pkg/dl
    +github.com/NVIDIA/go-nvml/pkg/nvml
agent          linux  arm64  +3, -0
    +github.com/DataDog/datadog-agent/pkg/collector/corechecks/gpu/nvidia
    +github.com/NVIDIA/go-nvml/pkg/dl
    +github.com/NVIDIA/go-nvml/pkg/nvml
iot-agent      linux  amd64  +3, -0
    +github.com/DataDog/datadog-agent/pkg/collector/corechecks/gpu/nvidia
    +github.com/NVIDIA/go-nvml/pkg/dl
    +github.com/NVIDIA/go-nvml/pkg/nvml
iot-agent      linux  arm64  +3, -0
    +github.com/DataDog/datadog-agent/pkg/collector/corechecks/gpu/nvidia
    +github.com/NVIDIA/go-nvml/pkg/dl
    +github.com/NVIDIA/go-nvml/pkg/nvml
heroku-agent   linux  amd64  +3, -0
    +github.com/DataDog/datadog-agent/pkg/collector/corechecks/gpu/nvidia
    +github.com/NVIDIA/go-nvml/pkg/dl
    +github.com/NVIDIA/go-nvml/pkg/nvml
cluster-agent  linux  amd64  +3, -0
    +github.com/DataDog/datadog-agent/pkg/collector/corechecks/gpu/nvidia
    +github.com/NVIDIA/go-nvml/pkg/dl
    +github.com/NVIDIA/go-nvml/pkg/nvml
cluster-agent  linux  arm64  +3, -0
    +github.com/DataDog/datadog-agent/pkg/collector/corechecks/gpu/nvidia
    +github.com/NVIDIA/go-nvml/pkg/dl
    +github.com/NVIDIA/go-nvml/pkg/nvml

@gjulianm force-pushed the guillermo.julian/nvml-metrics-agentcheck branch from 4386404 to 660683c on October 25, 2024 11:44

cit-pr-commenter bot commented Oct 25, 2024

Regression Detector

@gjulianm force-pushed the guillermo.julian/nvml-metrics-agentcheck branch from 660683c to c63b415 on October 30, 2024 15:06
@github-actions bot added the short review (PR is simple enough to be reviewed quickly) label Oct 30, 2024
agent-platform-auto-pr bot (Contributor) commented Oct 30, 2024

Test changes on VM

Use this command from test-infra-definitions to manually test this PR's changes on a VM:

inv create-vm --pipeline-id=48137588 --os-family=ubuntu

Note: This applies to commit 57b0ddf

@gjulianm marked this pull request as ready for review October 31, 2024 11:54
@gjulianm requested a review from a team as a code owner October 31, 2024 11:54

val06 (Contributor) left a comment

reviewed

test/new-e2e/tests/gpu/gpu_test.go (resolved)
pkg/collector/corechecks/gpu/gpu.go (outdated, resolved)
pkg/collector/corechecks/gpu/gpu.go (outdated, resolved)
@github-actions bot added the medium review (PR review might take time) label and removed the short review (PR is simple enough to be reviewed quickly) label Oct 31, 2024

gjulianm (Contributor, Author) commented Nov 4, 2024

/merge -@ today at 13:00 CET

dd-devflow bot commented Nov 4, 2024

🚂 MergeQueue: pull request scheduled for Mon, 04 Nov 2024 12:00:00 UTC

Pull Request scheduled to be added to the queue on Mon, 04 Nov 2024 12:00:00 UTC

Use /merge -c to cancel this operation!

dd-devflow bot commented Nov 4, 2024

🚂 MergeQueue: pull request added to the queue

The median merge time in main is 22m.

Use /merge -c to cancel this operation!

@dd-mergequeue bot merged commit 72940a1 into main Nov 4, 2024
214 checks passed
@dd-mergequeue bot deleted the guillermo.julian/nvml-metrics-agentcheck branch November 4, 2024 12:27
@github-actions bot added this to the 7.61.0 milestone Nov 4, 2024
Labels: changelog/no-changelog, medium review (PR review might take time), qa/done (QA done before merge and regressions are covered by tests), team/ebpf-platform