Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Render agent status using status component #22115

Merged
merged 61 commits into from
Jan 25, 2024

Conversation

GustavoCaso
Copy link
Member

@GustavoCaso GustavoCaso commented Jan 17, 2024

What does this PR do?

Motivation

Create the different status providers to render the agent status output, and configure the run command FX dependencies.

Logic to decide where the status provider should live

If a component is responsible for that particular section of the status output, create the status provider using FX value groups. It would require adding or changing that particular component return value.

Example: Adding the status provider to the host metadata component

Example: Changing the return value of the forwarder component to include the status provider

If there is no component, we move the status provider to the pkg/status/* or inside the original package.

Example: Move SNMP provider to the pkg/snmp

Example: Create JMX status provider in pkg/status/jmx

Addition to the status component

Added support for Noop providers. These providers would be entirely ignored by the status component and would not render any information.

We needed to propagate the verbose argument information to the different providers.

Important considerations

  • For components or packages that expose the status provider, I have created a status.go within their folder. Once the pattern of how to add new status provider is fully established, I will add new documentation in the docs/component folder

  • The majority of the additions are because of the status templates. Those are similar to those in the pkg/status/render. To help with the review, filter those out

  • Only the agent status is using the status component. Process agent, apm agent, cluster agent and the GUI status are using the old way. We do this to minimize regressions. Since I'm migrating one status command at a time, I need to keep the other templates for now. So, the existing templates at pkg/status/render/templates/* can not be deleted yet, as I want the other status command to work in isolation from the agent status command. I will remove those once I migrate all status commands.

  • I've added some basics for all the new status providers. The test makes sure no unwanted panic or errors are returned. It would be ideal to have a test validating the result of each provider. I can only write some of the tests myself; it would take too much time. I would leave that exercise to the code owners of those files.

Describe how to test/QA your changes

PLEASE TAKE INTO CONSIDERATION THAT THE ORDER OF THE STATUS OUTPUT HAS CHANGED. When doing the QA, we need to focus on the same information being displayed and not too much on order.

Build the agent binary invoke agent.build --build-exclude=systemd

Test that the agent status CLI command returns the right information.

  • JSON format agent status -j
  • Pretty JSON format agent status -p
  • Text format agent status
  • Text format with the verbose flag agent status -v

The agent status displays information about other agent processes if they are running. For testing those permutations, you can build the other agent and run them + modify the agent configuration.

To build and run the process agent, follow this guide
To build the APM agent, run: inv -e trace-agent.build . To run the apm agent ./bin/trace-agent/trace-agent --config dev/dist/datadog.yaml

Here is an example of a configuration with logs, apm and process agent enabled:

api_key: <YOUR API KEY>
site: <YOUR SITE>

logs_enabled: true
apm_config:
  enabled: true
  apm_non_local_traffic: true
process_config:
  process_collection:
    enabled: true

Reviewer's Checklist

  • If known, an appropriate milestone has been selected; otherwise the Triage milestone is set.
  • Use the major_change label if your change either has a major impact on the code base, is impacting multiple teams or is changing important well-established internals of the Agent. This label will be use during QA to make sure each team pay extra attention to the changed behavior. For any customer facing change use a releasenote.
  • A release note has been added or the changelog/no-changelog label has been applied.
  • Changed code has automated tests for its functionality.
  • Adequate QA/testing plan information is provided. Except if the qa/skip-qa label, with required either qa/done or qa/no-code-change labels, are applied.
  • At least one team/.. label has been applied, indicating the team(s) that should QA this change.
  • If applicable, docs team has been notified or an issue has been opened on the documentation repo.
  • If applicable, the need-change/operator and need-change/helm labels have been applied.
  • If applicable, the k8s/<min-version> label, indicating the lowest Kubernetes version compatible with this feature.
  • If applicable, the config template has been updated.

Copy link

cit-pr-commenter bot commented Jan 17, 2024

Go Package Import Differences

Baseline: b0f3ede
Comparison: 6a69662

binaryosarchchange
agentlinuxamd64
+5, -0
+github.com/DataDog/datadog-agent/comp/dogstatsd/status/statusimpl
+github.com/DataDog/datadog-agent/comp/process/status/statusimpl
+github.com/DataDog/datadog-agent/comp/trace/status/statusimpl
+github.com/DataDog/datadog-agent/pkg/collector/corechecks/net
+github.com/DataDog/datadog-agent/pkg/status/httpproxy
agentlinuxarm64
+5, -0
+github.com/DataDog/datadog-agent/comp/dogstatsd/status/statusimpl
+github.com/DataDog/datadog-agent/comp/process/status/statusimpl
+github.com/DataDog/datadog-agent/comp/trace/status/statusimpl
+github.com/DataDog/datadog-agent/pkg/collector/corechecks/net
+github.com/DataDog/datadog-agent/pkg/status/httpproxy
agentwindowsamd64
+5, -0
+github.com/DataDog/datadog-agent/comp/dogstatsd/status/statusimpl
+github.com/DataDog/datadog-agent/comp/process/status/statusimpl
+github.com/DataDog/datadog-agent/comp/trace/status/statusimpl
+github.com/DataDog/datadog-agent/pkg/collector/corechecks/net
+github.com/DataDog/datadog-agent/pkg/status/httpproxy
agentwindows386
+5, -0
+github.com/DataDog/datadog-agent/comp/dogstatsd/status/statusimpl
+github.com/DataDog/datadog-agent/comp/process/status/statusimpl
+github.com/DataDog/datadog-agent/comp/trace/status/statusimpl
+github.com/DataDog/datadog-agent/pkg/collector/corechecks/net
+github.com/DataDog/datadog-agent/pkg/status/httpproxy
agentdarwinamd64
+5, -0
+github.com/DataDog/datadog-agent/comp/dogstatsd/status/statusimpl
+github.com/DataDog/datadog-agent/comp/process/status/statusimpl
+github.com/DataDog/datadog-agent/comp/trace/status/statusimpl
+github.com/DataDog/datadog-agent/pkg/collector/corechecks/net
+github.com/DataDog/datadog-agent/pkg/status/httpproxy
agentdarwinarm64
+5, -0
+github.com/DataDog/datadog-agent/comp/dogstatsd/status/statusimpl
+github.com/DataDog/datadog-agent/comp/process/status/statusimpl
+github.com/DataDog/datadog-agent/comp/trace/status/statusimpl
+github.com/DataDog/datadog-agent/pkg/collector/corechecks/net
+github.com/DataDog/datadog-agent/pkg/status/httpproxy
iot-agentlinuxamd64
+5, -0
+github.com/DataDog/datadog-agent/comp/dogstatsd/status/statusimpl
+github.com/DataDog/datadog-agent/comp/process/status/statusimpl
+github.com/DataDog/datadog-agent/comp/trace/status/statusimpl
+github.com/DataDog/datadog-agent/pkg/collector/corechecks/net
+github.com/DataDog/datadog-agent/pkg/status/httpproxy
iot-agentlinuxarm64
+5, -0
+github.com/DataDog/datadog-agent/comp/dogstatsd/status/statusimpl
+github.com/DataDog/datadog-agent/comp/process/status/statusimpl
+github.com/DataDog/datadog-agent/comp/trace/status/statusimpl
+github.com/DataDog/datadog-agent/pkg/collector/corechecks/net
+github.com/DataDog/datadog-agent/pkg/status/httpproxy
heroku-agentlinuxamd64
+5, -0
+github.com/DataDog/datadog-agent/comp/dogstatsd/status/statusimpl
+github.com/DataDog/datadog-agent/comp/process/status/statusimpl
+github.com/DataDog/datadog-agent/comp/trace/status/statusimpl
+github.com/DataDog/datadog-agent/pkg/collector/corechecks/net
+github.com/DataDog/datadog-agent/pkg/status/httpproxy
dogstatsdlinuxamd64
+1, -0
+github.com/DataDog/datadog-agent/comp/core/status
dogstatsdlinuxarm64
+1, -0
+github.com/DataDog/datadog-agent/comp/core/status
serverlesslinuxamd64
+1, -0
+github.com/DataDog/datadog-agent/comp/core/status
serverlesslinuxarm64
+1, -0
+github.com/DataDog/datadog-agent/comp/core/status
system-probelinuxamd64
+1, -0
+github.com/DataDog/datadog-agent/comp/core/status
system-probelinuxarm64
+1, -0
+github.com/DataDog/datadog-agent/comp/core/status
system-probewindowsamd64
+1, -0
+github.com/DataDog/datadog-agent/comp/core/status
trace-agentlinuxamd64
+1, -0
+github.com/DataDog/datadog-agent/comp/core/status
trace-agentlinuxarm64
+1, -0
+github.com/DataDog/datadog-agent/comp/core/status
trace-agentwindowsamd64
+1, -0
+github.com/DataDog/datadog-agent/comp/core/status
trace-agentwindows386
+1, -0
+github.com/DataDog/datadog-agent/comp/core/status
trace-agentdarwinamd64
+1, -0
+github.com/DataDog/datadog-agent/comp/core/status
trace-agentdarwinarm64
+1, -0
+github.com/DataDog/datadog-agent/comp/core/status
heroku-trace-agentlinuxamd64
+1, -0
+github.com/DataDog/datadog-agent/comp/core/status

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Traps status is not rendering anymore on my side, working fine in the GUI but not in the CLI 🤔
Conf to reproduce:

network_devices:
  namespace: test
  snmp_traps:
    enabled: true
    port: 9162
    community_strings:
      - public
    bind_host: 0.0.0.0

Copy link
Member Author

@GustavoCaso GustavoCaso Jan 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you so much for reporting the issue.

An issue existed when dynamically accessing the config object inside an fx.Supply function. I changed the way we dynamically provide the different status providers. Here is the commit 6a69662

This is an output using the configuration you provided

image

@GustavoCaso GustavoCaso force-pushed the render-agent-status-using-status-component branch from 2b20ac3 to 6a69662 Compare January 25, 2024 13:07
@GustavoCaso
Copy link
Member Author

/merge

@dd-devflow
Copy link

dd-devflow bot commented Jan 25, 2024

🚂 MergeQueue

Pull request added to the queue.

There are 4 builds ahead! (estimated merge in less than 1h)

Use /merge -c to cancel this operation!

@dd-mergequeue dd-mergequeue bot merged commit 1f83d84 into main Jan 25, 2024
170 of 171 checks passed
@dd-mergequeue dd-mergequeue bot deleted the render-agent-status-using-status-component branch January 25, 2024 17:26
@0intro 0intro self-requested a review January 29, 2024 12:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.