diff --git a/search/search_index.json b/search/search_index.json index 9b27530aeef89..bb2175cd2c8d1 100644 --- a/search/search_index.json +++ b/search/search_index.json @@ -1 +1 @@ -{"config":{"lang":["en"],"separator":"[\\s\\-]+","pipeline":["stopWordFilter"]},"docs":[{"location":"","title":"Datadog Agent","text":"
Welcome to the wonderful world of developing the Datadog Agent. Here we document how we do things, advanced debugging techniques, coding conventions & best practices, the internals of our testing infrastructure, and so much more.
If you are intrigued, continue reading. If not, continue all the same.
"},{"location":"#getting-started","title":"Getting started","text":"First, you'll want to set up your development environment.
"},{"location":"#agent-development-guidelines","title":"Agent development guidelines","text":"To know more about the general design of the Agent and how to add code and feature read our section on Components.
"},{"location":"#navigation","title":"Navigation","text":"Desktop readers can use keyboard shortcuts to navigate.
To build the Agent on Windows, see datadog-agent-buildimages.
"},{"location":"setup/#linux-and-macos","title":"Linux and macOS","text":""},{"location":"setup/#python","title":"Python","text":"The Agent embeds a full-fledged CPython interpreter so it requires the development files to be available in the dev env. The Agent can embed Python 2 and/or Python 3, you will need development files for all versions you want to support.
If you're on OSX/macOS, install Python 2.7 and/or 3.11 with Homebrew:
brew install python@2\nbrew install python@3.11\n
On Linux, depending on the distribution, you might need to explicitly install the development files, for example on Ubuntu:
sudo apt-get install python2.7-dev\nsudo apt-get install python3.11-dev\n
On Windows, installing Python 2.7 and/or 3.11 via the official installer brings along all the development files needed.
Warning
If you don't use one of the Python versions that are explicitly supported, you may have problems running the built Agent's Python checks, especially if using a virtualenv. At this time, only Python 3.11 is confirmed to work as expected in the development environment.
"},{"location":"setup/#python-dependencies","title":"Python Dependencies","text":""},{"location":"setup/#preface","title":"Preface","text":"Invoke is a task runner written in Python that is extensively used in this project to orchestrate builds and test runs. To run the tasks, you need to have it installed on your machine. We offer two different ways to run our invoke tasks.
"},{"location":"setup/#deva-recommended","title":"deva
(recommended)","text":"The deva
CLI tool is a single binary, built by the Datadog team, that installs and manages the development environment for the Agent. It will install all the necessary Python dependencies for you. The development environment will be completely independent of your system Python installation. This tool leverages PyApp, a wrapper for Python applications that bootstrap themselves at runtime. In our case, we wrap invoke
itself and include the dependencies needed to work on the Agent.
To install deva
, you'll need to:
deva
in place of invoke
or inv
.The Python environment will automatically be created on the first run and will be reused for subsequent runs. For example:
cd datadog-agent\ncurl -L -o deva https://github.com/DataDog/datadog-agent-devtools/releases/download/deva-v1.0.0/deva-aarch64-unknown-linux-gnu-1.0.0\nchmod +x deva\n./deva linter.go\n
Below is a live demo of how the tool works:
If you want to uninstall deva
, you can simply run the ./deva self remove
command, which will remove the virtual environment from your system, and remove the binary. That's it.
To protect and isolate your system-wide Python installation, a Python virtual environment is highly recommended (though optional). It will help keep a self-contained development environment and ensure a clean system Python.
Note
Due to the way some virtual environments handle executable paths (e.g. python -m venv
), not all virtual environment options will be able to run the built Agent correctly. At this time, the only virtual environment creator confirmed to work is virtualenv
.
python3 -m pip install virtualenv\n
virtualenv $GOPATH/src/github.com/DataDog/datadog-agent/venv\n
If using virtual environments when running the built Agent, you may need to override the built Agent's search path for Python check packages using the PYTHONPATH
variable (your target path must have the prerequisite core integration packages installed, though).
PYTHONPATH=\"./venv/lib/python3.11/site-packages:$PYTHONPATH\" ./agent run ...\n
See also some notes in ./checks about running custom python checks.
"},{"location":"setup/#install-invoke-and-its-dependencies","title":"Install Invoke and its dependencies","text":"Our invoke tasks are only compatible with Python 3, thus you will need to use Python 3 to run them.
Though you may install invoke in a variety of ways, we suggest you use the provided requirements file and pip
:
pip install -r tasks/requirements.txt\n
This procedure ensures you not only get the correct version of invoke
, but also any additional python dependencies our development workflow may require, at their expected versions. It will also pull other handy development tools/deps (reno
, or docker
).
You must install Golang version 1.22.8
or higher. Make sure that $GOPATH/bin
is in your $PATH; otherwise, invoke cannot use any additional tools it might need.
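For example, a line like the following in your shell profile keeps Go-installed tools on your PATH (illustrative; adjust to your setup):
export PATH=\"$GOPATH/bin:$PATH\"\n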
Note
Versions of Golang that aren't an exact match to the version specified in our build images (see e.g. here) may not be able to build the agent and/or the rtloader binary properly.
"},{"location":"setup/#installing-tooling","title":"Installing tooling","text":"From the root of datadog-agent
, run invoke install-tools
to install go tooling. This uses go
to install the necessary dependencies.
When working on the Agent codebase, you can choose between two different ways to build the binary, informally named System and Embedded builds. For most contribution scenarios, you should rely on the System build (the default) and use the Embedded one only for specific use cases. Let's explore the differences.
"},{"location":"setup/#system-build","title":"System build","text":"System builds use your operating system's standard system libraries to satisfy the Agent's external dependencies. Since, for example, macOS 10.11 may provide a different version of Python than macOS 10.12, system builds on each of these platforms may produce different Agent binaries. If this doesn't matter to you\u2014perhaps you just want to contribute a quick bugfix\u2014do a System build; it's easier and faster than an Embedded build. System build is the default for all build and test tasks, so you don't need to configure anything there. But to make sure you have system copies of all the Agent's dependencies, skip the Embedded build section below and read on to see how to install them via your usual package manager (apt, yum, brew, etc).
"},{"location":"setup/#embedded-build","title":"Embedded build","text":"Embedded builds download specifically-versioned dependencies and compile them locally from sources. We run Embedded builds to create Datadog's official Agent releases (i.e. RPMs, debs, etc), and while you can run the same builds while developing locally, the process is as slow as it sounds. Hence, you should only use them when you care about reproducible builds. For example:
Embedded builds rely on Omnibus to download and build dependencies, so you need a recent ruby
environment with bundler
installed. See how to build Agent packages with Omnibus for more details.
The Agent can collect systemd journal logs using a wrapper around the systemd utility library.
On Ubuntu/Debian:
sudo apt-get install libsystemd-dev\n
On Redhat/CentOS:
sudo yum install systemd-devel\n
"},{"location":"setup/#docker","title":"Docker","text":"If you want to build a Docker image containing the Agent, or if you wan to run system and integration tests you need to run a recent version of Docker in your dev environment.
"},{"location":"setup/#doxygen","title":"Doxygen","text":"We use Doxygen to generate the documentation for the rtloader
part of the Agent.
To generate it (using the invoke rtloader.generate-doc
command), you'll need to have Doxygen installed on your system and available in your $PATH
. You can compile and install Doxygen from source with the instructions available here. Alternatively, you can use already-compiled Doxygen binaries from here.
To get the dependency graphs, you may also need to install the dot
executable from graphviz and add it to your $PATH
.
It is optional but recommended to install pre-commit
to run a number of checks done by the CI locally.
To install it, run:
python3 -m pip install pre-commit\npre-commit install\n
The shellcheck
pre-commit hook requires having the shellcheck
binary installed and in your $PATH
. To install it, run:
deva install-shellcheck --destination <path>\n
(by default, the shellcheck binary is installed in /usr/local/bin
).
pre-commit
","text":"If you want to skip pre-commit
for a specific commit you can add --no-verify
to the git commit
command.
pre-commit
manually","text":"If you want to run one of the checks manually, you can run pre-commit run <check name>
.
You can run it on all files with the --all-files
flag.
pre-commit run flake8 --all-files # run flake8 on all files\n
See pre-commit run --help
for further options.
Microsoft Visual Studio Code with the devcontainer plugin allows you to use a container as a remote development environment in VS Code. It simplifies and isolates the dependencies needed to develop in this repository.
To configure the VS Code editor to use a container as a remote development environment, you need to:
deva vscode.setup-devcontainer --image \"<image name>\"
. This command will create the devcontainer configuration file .devcontainer/devcontainer.json
.Microsoft Visual Studio Code is recommended as it's lightweight and versatile.
Building on Windows requires multiple pieces of third-party software to be installed. To avoid the complexity, Datadog recommends making the code change in VS Code and then doing the build in a Docker image. For complete information, see Build the Agent packages.
"},{"location":"architecture/dogstatsd/internals/","title":"DogStatsD internals","text":"(click to enlarge)
Information on DogStatsD, configuration and troubleshooting is available in the Datadog documentation.
"},{"location":"architecture/dogstatsd/internals/#packet","title":"Packet","text":"In DogStatsD, a Packet is a bytes array containing one or multiple metrics in the DogStatsD format (separated by a \\n
when there are several). Its maximum size is dogstatsd_buffer_size
.
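For illustration, a single Packet built from UDP traffic could hold two metrics in the standard DogStatsD wire format (names and tags below are made up):
page.views:1|c|#env:dev\ntemperature:23.5|g|#env:dev,sensor:kitchen\n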
\\n
The PacketAssembler gathers multiple datagrams into one Packet of maximum size, dogstatsd_buffer_size
, and sends it to the PacketsBuffer which avoids running the whole parsing pipeline with only one metric per packet. The bytes buffer used comes from the PacketPool, which avoids re-allocating the bytes buffer every time.
Note
The UDS pipeline does not use the PacketAssembler because each UDS packet also contains metadata (origin tags) used to enrich the metrics tags, which makes it impossible for the PacketAssembler to pack them together.
The PacketAssembler does not allocate a bytes array every time it needs one. It retrieves one from a pool of pre-allocated arrays; this pool never empties because the PacketAssembler allocates a new bytes array whenever one is needed. Once fully assembled by the PacketAssembler, the bytes array is sent through the rest of the DogStatsD pipeline and ownership is transferred to each part using it (PacketsBuffer, Worker). Eventually, the Worker returns it to the pool once its content has been processed.
"},{"location":"architecture/dogstatsd/internals/#packetbuffer","title":"PacketBuffer","text":"\\n
The PacketsBuffer buffers multiple Packets (in a slice) so that the parsing part of the pipeline goes through several Packets in a row each time it is called, instead of only one. This reduces CPU usage. The PacketsBuffer sends the Packets for processing when either:
a. The buffer is full (contains dogstatsd_packet_buffer_size, default value: 32
)
or
b. A timer is triggered (i.e. dogstatsd_packet_buffer_flush_timeout, default value: 100ms
)
The PacketsBuffer sends them over a Go buffered channel to the worker/parser, meaning that the channel can buffer the Packets on its own while waiting for the worker to read and process them.
In theory, the max memory usage of this Go buffered channel is:
dogstatsd_packet_buffer_size
* dogstatsd_buffer_size
* dogstatsd_queue_size
To this we can add per-listener buffers: dogstatsd_packet_buffer_size
* dogstatsd_buffer_size
* connections
. connections
will be 1 for uds
and udp
and one per client for uds-stream
.
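As a rough worked example, with the default dogstatsd_packet_buffer_size of 32 and illustrative values of 8192 bytes for dogstatsd_buffer_size and 1024 for dogstatsd_queue_size (check the defaults shipped with your Agent version), the channel bound is 32 * 8192 * 1024 bytes, or about 256 MiB, and each additional connection adds up to 32 * 8192 bytes (256 KiB) of per-listener buffering.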
The Worker is the part of the DogStatsD server responsible for parsing the metrics in the bytes array and turning them into MetricSamples.
The server spawns multiple workers based on the amount of cores available on the host:
(number of cores - 2)
workers. If this result is less than 2, the server spawns 2 workers.(number of cores / 2)
workers. If this result is less than 2, the server spawns 2 workers.The Worker uses a system called the StringInterner to avoid allocating memory every time a string is needed. Note that the StringInterner caches a finite number of strings; when it is full, it is emptied and starts caching strings again. Its size is configurable with dogstatsd_string_interner_size
.
The MetricSamples created are not directly sent to the Agent Demultiplexer but first to a part called the Batcher.
"},{"location":"architecture/dogstatsd/internals/#batcher","title":"Batcher","text":"The role of the Batcher is to accumulate multiple MetricSamples before sending them to the Agent Demultiplexer. Every time it accumulates 32 MetricSamples, the Batcher sends them to the Demultiplexer. The Batcher sends 32 MetricSamples in a channel buffering 100 sets. There is one channel per TimeSampler.
The size of a MetricSample depends on the size of the host's hostname, its metric name, and its number of tags. An example MetricSample with a 20 character hostname, 40 character metric name, and 200 characters of tags has a size of approximately 264 bytes. A Batcher can use a maximum of 844kb of memory.
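That maximum follows from the dimensions above: 32 MetricSamples per set * 100 buffered sets * ~264 bytes per sample is roughly 844,800 bytes, or about 844 KB.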
"},{"location":"architecture/dogstatsd/internals/#timesamplerworker","title":"TimeSamplerWorker","text":"The TimeSamplerWorker runs in an infinite loop. It is responsible for the following:
The following calculations determine the number of TimeSamplerWorker and TimeSampler instances:
dogstatsd_pipeline_autoadjust
is true
then the workers count will be automatically adjusted.dogstatsd_pipeline_count
has a value, the number of TimeSampler pipelines equals that value.dogstatsd_pipeline_autoadjust_strategy
can be set to the following values:
max_throughput
: The number of TimeSampler pipelines is adjusted to maximize throughput. There are (number of core/2) - 1
instances of TimeSampler.per_origin
: The number of TimeSampler pipelines is adjusted to improve data locality. The number of dsdWorker instances is equal to half the number of cores, and the number of TimeSampler pipelines is equal to dogstatsd_pipeline_count
or twice the number of cores. This strategy will provide a better compression ratio in shared environments and improve resource allocation fairness within the agent.The NoAggregationStreamWorker runs an infinite loop in a goroutine. It receives metric samples with timestamps and batches them to be sent as quickly as possible to the intake. It performs no aggregation or extra processing, apart from adding tags to the metrics.
It runs only when dogstatsd_no_aggregation_pipeline
is set to true
.
The payload being sent to the intake (through the normal Serializer
/Forwarder
pieces) contains, at maximum, dogstatsd_no_aggregation_pipeline_batch_size
metrics. This value defaults to 2048
.
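For reference, the options named in the last two sections could be set together in datadog.yaml like this (values are illustrative, not recommendations):
dogstatsd_pipeline_autoadjust: true\ndogstatsd_pipeline_autoadjust_strategy: per_origin\ndogstatsd_no_aggregation_pipeline: true\ndogstatsd_no_aggregation_pipeline_batch_size: 2048\n
A metric sample that carries its own timestamp, the kind this worker handles, uses the trailing |T field of the DogStatsD wire format, for example: page.views:15|c|#env:dev|T1656581400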
Fx groups help you produce and group together values of the same type, even if these values are produced in different parts of the codebase. A component can add any type into a group; this group can then be consumed by other components.
In the following example, a component adds a server.Endpoint
type to the server
group.
type Provides struct {\n comp Component\n Endpoint server.Endpoint `group:\"server\"`\n}\n
In the following example, a component requests all the types added to the server
group. This takes the form of a slice received at instantiation.
type Requires struct {\n Endpoints []Endpoint `group:\"server\"`\n}\n
"},{"location":"components/creating-bundles/","title":"Creating a bundle","text":"A bundle is a grouping of related components. The goal of a bundle is to ease the usage of multiple components working together to constitute a product.
One example is DogStatsD
, a server to receive metrics locally from customer apps. DogStatsD
is composed of 9+ components, but at the binary level we want to include DogStatsD
as a whole.
For use cases like that of DogStatsD, create a bundle.
"},{"location":"components/creating-bundles/#creating-a-bundle_1","title":"Creating a bundle","text":"A bundle eases the aggregation of multiple components and lives in comp/<bundlesName>/
.
// Package <bundleName> ...\npackage <bundleName>\n\nimport (\n \"github.com/DataDog/datadog-agent/pkg/util/fxutil\"\n\n // We import all the components that we want to aggregate. A bundle must only aggregate components within its\n // sub-folders.\n comp1fx \"github.com/DataDog/datadog-agent/comp/<bundleName>/comp1/fx\"\n comp2fx \"github.com/DataDog/datadog-agent/comp/<bundleName>/comp2/fx\"\n comp3fx \"github.com/DataDog/datadog-agent/comp/<bundleName>/comp3/fx\"\n comp4fx \"github.com/DataDog/datadog-agent/comp/<bundleName>/comp4/fx\"\n)\n\n// A single team must own the bundle, even if they don't own all the sub-components\n// team: <the team owning the bundle>\n\n// Bundle defines the fx options for this bundle.\nfunc Bundle() fxutil.BundleOptions {\n return fxutil.Bundle(\n comp1fx.Module(),\n comp2fx.Module(),\n comp3fx.Module(),\n comp4fx.Module(),\n )\n}\n
A bundle doesn't need to import all sub components. The idea is to offer a default, easy to use grouping of components. But nothing prevents users from cherry-picking the components they want to use.
"},{"location":"components/creating-components/","title":"Creating a Component","text":"This page explains how to create components in detail.
This page uses the example of creating a compression component. This component compresses a payload before sending it to the Datadog backend.
Since there are multiple ways to compress data, this component provides two implementations of the same interface:
A component contains multiple folders and Go packages. Developers split a component into packages to isolate the interface from the implementations and improve code sharing. Declaring the interface in a separate package from the implementation allows you to import the interface without importing all of the implementations.
"},{"location":"components/creating-components/#file-hierarchy","title":"File hierarchy","text":"All components are located in the comp
folder at the top of the Agent repo.
The file hierarchy is as follows:
comp /\n <bundle name> / <-- Optional\n <comp name> /\n def / <-- The folder containing the component interface and ALL its public types.\n impl / <-- The only or primary implementation of the component.\n impl-<alternate> / <-- An alternate implementation.\n impl-none / <-- Optional. A noop implementation.\n fx / <-- All fx related logic for the primary implementation, if any.\n fx-<alternate> / <-- All fx related logic for a specific implementation.\n mock / <-- The mock implementation of the component to ease testing.\n
To note:
impl
folder.impl-<version>
folders instead of an impl
folder. For example, your compression component has impl-zstd
and impl-zip
folders, but not an impl
folder.impl-none
folder.This file hierarchy aims to solve a few problems:
def
folders and never care about which implementation was loaded in the main function.fx
folder per implementation, to allow binaries to import/link against a single folder.You can use the invoke task deva components.new-component comp/<COMPONENT_NAME>
to generate a scaffold for your new component.
Every public variable, function, struct, and interface of your component must be documented. Refer to the Documentation section below for details.
"},{"location":"components/creating-components/#the-def-folder","title":"The def folder","text":"The def
folder contains your interface and ALL public types needed by the users of your component.
In the example of a compression component, the def folder looks like this:
comp/compression/def/component.go// Package compression contains all public type and interfaces for the compression component\npackage compression\n\n// team: <your team>\n\n// Component describes the interface implemented by all compression implementations.\ntype Component interface {\n // Compress compresses the input data.\n Compress([]byte) ([]byte, error)\n\n // Decompress decompresses the input data.\n Decompress([]byte) ([]byte, error)\n}\n
All component interfaces must be called Component
, so all imports have the form <COMPONENT_NAME>.Component
.
You can see that the interface only exposes the bare minimum. You should aim at having the smallest possible interface for your component.
When defining a component interface, avoid using structs or interfaces from third-party dependencies.
Interface using a third-party dependency
package telemetry\n\nimport \"github.com/prometheus/client_golang/prometheus\"\n\n// team: agent-shared-components\n\n// Component is the component type.\ntype Component interface {\n // RegisterCollector Registers a Collector with the prometheus registry\n RegisterCollector(c prometheus.Collector)\n}\n
In the example above, every user of the telemetry
component would have to import github.com/prometheus/client_golang/prometheus
no matter which implementation they use.
In general, be mindful of using external types in the public interface of your component. For example, it would make sense to use Docker types in a docker
component, but not in a container
component.
The impl
folder is where the component implementation is written. The details of component implementation are up to the developer. The only requirements are that the package name follows the pattern <COMPONENT_NAME>impl
and that there is a public instantiation function called NewComponent
.
package compressionimpl\n\n// NewComponent returns a new ZSTD implementation for the compression component\nfunc NewComponent(reqs Requires) Provides {\n ....\n}\n
To require input arguments to the NewComponent
instantiation function, use a special struct named Requires
. The instantiation function returns a special stuct named Provides
. This internal nomenclature is used to handle the different component dependencies using Fx groups.
In this example, the compression component must access the configuration component and the log component. To express this, define a Requires
struct with two fields. The name of the fields is irrelevant, but the type must be the concrete type of interface that you require.
package compressionimpl\n\nimport (\n \"fmt\"\n\n config \"github.com/DataDog/datadog-agent/comp/core/config/def\"\n log \"github.com/DataDog/datadog-agent/comp/core/log/def\"\n)\n\n// Here, list all components and other types known by Fx that you need.\n// To be used in `fx` folders, type and field need to be public.\n//\n// In this example, you need config and log components.\ntype Requires struct {\n Conf config.Component\n Log log.Component\n}\n
Using other components
If you want to use another component within your own, add it to the Requires
struct, and Fx
will give it to you at initialization. Be careful of circular dependencies.
For the output of the component, populate the Provides
struct with the return values.
package compressionimpl\n\nimport (\n // Always import the component def folder, so that you can return a 'compression.Component' type.\n compression \"github.com/DataDog/datadog-agent/comp/compression/def\"\n)\n\n// Here, list all the types your component is going to return. You can return as many types as you want; all of them are available through Fx in other components.\n// To be used in `fx` folders, type and field need to be public.\n//\n// In this example, only the compression component is returned.\ntype Provides struct {\n Comp compression.Component\n}\n
All together, the component code looks like the following:
comp/compression/impl-zstd/compressor.gopackage compressionimpl\n\nimport (\n \"fmt\"\n\n compression \"github.com/DataDog/datadog-agent/comp/compression/def\"\n config \"github.com/DataDog/datadog-agent/comp/core/config/def\"\n log \"github.com/DataDog/datadog-agent/comp/core/log/def\"\n)\n\ntype Requires struct {\n Conf config.Component\n Log log.Component\n}\n\ntype Provides struct {\n Comp compression.Component\n}\n\n// The actual type implementing the 'Component' interface. This type MUST be private, you need the guarantee that\n// components can only be used through their respective interfaces.\ntype compressor struct {\n // Keep a ref on the config and log components, so that you can use them in the 'compressor' methods\n conf config.Component\n log log.Component\n\n // any other field you might need\n}\n\n// NewComponent returns a new ZSTD implementation for the compression component\nfunc NewComponent(reqs Requires) Provides {\n // Here, do whatever is needed to build a ZSTD compression comp.\n\n // And create your component\n comp := &compressor{\n conf: reqs.Conf,\n log: reqs.Log,\n }\n\n return Provides{\n Comp: comp,\n }\n}\n\n//\n// You then need to implement all methods from your 'compression.Component' interface\n//\n\n// Compress compresses the input data using ZSTD\nfunc (c *compressor) Compress(data []byte) ([]byte, error) {\n c.log.Debug(\"compressing a buffer with ZSTD\")\n\n // [...]\n return compressData, nil\n}\n\n// Decompress decompresses the input data using ZSTD.\nfunc (c *compressor) Decompress(data []byte) ([]byte, error) {\n c.log.Debug(\"decompressing a buffer with ZSTD\")\n\n // [...]\n return compressData, nil\n}\n
The constructor can return either a Provides
, if it is infallible, or (Provides, error)
, if it could fail. In the latter case, a non-nil error results in the Agent crashing at startup with a message containing the error.
Each implementation follows the same pattern.
"},{"location":"components/creating-components/#the-fx-folders","title":"The fx folders","text":"The fx
folder must be the only folder importing and referencing Fx. It's meant to be a simple wrapper. Its only goal is to allow dependency injection with Fx for your component.
All fx.go
files must define a func Module() fxutil.Module
function. The helpers contained in fxutil
handle all the logic. Most fx/fx.go
file should look the same as this:
package fx\n\nimport (\n \"github.com/DataDog/datadog-agent/pkg/util/fxutil\"\n\n // You must import the implementation you are exposing through FX\n compressionimpl \"github.com/DataDog/datadog-agent/comp/compression/impl-zstd\"\n)\n\n// Module specifies the compression module.\nfunc Module() fxutil.Module {\n return fxutil.Component(\n // ProvideComponentConstructor will automatically detect the 'Requires' and 'Provides' structs\n // of your constructor function and map them to FX.\n fxutil.ProvideComponentConstructor(\n compressionimpl.NewComponent,\n )\n )\n}\n
Optional dependencies
To create an optional wrapper type for your component, you can use the helper function fxutil.ProvideOptional
. This generic function requires the type of the component interface, and will automatically make a conversion function optional.Option
for that component.
More on this in the FAQ.
For the ZIP implementation, create the same file in fx-zip
folder. In most cases, your component has a single implementation. If so, you have only one impl
and fx
folder.
fx-none
","text":"Some parts of the codebase might have optional dependencies on your components (see FAQ).
If it's the case, you need to provide a fx wrapper called fx-none
to avoid duplicating the use of optional.NewNoneOption[def.Component]()
in all our binaries
import (\n compression \"github.com/DataDog/datadog-agent/comp/compression/def\"\n)\n\nfunc Module() fxutil.Module {\n return fxutil.Component(\n fx.Provide(func() optional.Option[compression.Component] {\n return optional.NewNoneOption[compression.Component]()\n }))\n}\n
"},{"location":"components/creating-components/#the-mock-folder","title":"The mock folder","text":"To support testing, components MUST provide a mock implementation (unless your component has no public method in its interface).
Your mock must implement the Component
interface of the def
folder but can expose more methods if needed. All mock constructors must take a *testing.T
as parameter.
In the following example, your mock has no dependencies and returns the same string every time.
comp/compression/mock/mock.go//go:build test\n\npackage mock\n\nimport (\n \"testing\"\n\n compression \"github.com/DataDog/datadog-agent/comp/compression/def\"\n)\n\ntype Provides struct {\n Comp compression.Component\n}\n\ntype mock struct {}\n\n// New returns a mock compressor\nfunc New(*testing.T) Provides {\n return Provides{\n comp: &mock{},\n }\n}\n\n// Compress compresses the input data using ZSTD\nfunc (c *mock) Compress(data []byte) ([]byte, error) {\n return []byte(\"compressed\"), nil\n}\n\n// Decompress decompresses the input data using ZSTD.\nfunc (c *compressor) Decompress(data []byte) ([]byte, error) {\n return []byte(\"decompressed\"), nil\n}\n
"},{"location":"components/creating-components/#go-module","title":"Go module","text":"Go modules are not mandatory, but if you want to allow your component to be used outside the datadog-agent
repository, create Go modules in the following places:
impl
/impl-*
folder that you want to expose (you can only expose some implementations).def
folder to expose the interfacemock
folder to expose the mockNever add a Go module to the component folder (for example,comp/compression
) or any fx
folders.
In the end, a classic component folder should look like:
comp/<COMPONENT_NAME>/\n\u251c\u2500\u2500 def\n\u2502 \u2514\u2500\u2500 component.go\n\u251c\u2500\u2500 fx\n\u2502 \u2514\u2500\u2500 fx.go\n\u251c\u2500\u2500 impl\n\u2502 \u2514\u2500\u2500 component.go\n\u2514\u2500\u2500 mock\n \u2514\u2500\u2500 mock.go\n\n4 directories, 4 files\n
The example compression component, which has two implementations, looks like:
comp/core/compression/\n\u251c\u2500\u2500 def\n\u2502 \u2514\u2500\u2500 component.go\n\u251c\u2500\u2500 fx-zip\n\u2502 \u2514\u2500\u2500 fx.go\n\u251c\u2500\u2500 fx-zstd\n\u2502 \u2514\u2500\u2500 fx.go\n\u251c\u2500\u2500 impl-zip\n\u2502 \u2514\u2500\u2500 component.go\n\u251c\u2500\u2500 impl-zstd\n\u2502 \u2514\u2500\u2500 component.go\n\u2514\u2500\u2500 mock\n \u2514\u2500\u2500 mock.go\n\n6 directories, 6 files\n
This can seem like a lot for a single compression component, but this design answers the exponentially increasing complexity of the Agent ecosystem. Your component needs to behave correctly with many binaries composed of unique and shared components, outside repositories that want to pull only specific features, and everything in between.
Important
No components know how or where they will be used and MUST, therefore, respect all the rules above. It's a very common pattern for teams to work only on their use cases, thinking their code will not be used anywhere else. But customers want common behavior between all Datadog products (Agent, serverless, Agentless, Helm, Operator, etc.).
A key idea behind the component is to produce shareable and reusable code.
"},{"location":"components/creating-components/#general-consideration-about-designing-components","title":"General consideration about designing components","text":"Your component must:
The documentation (both package-level and method-level) should include everything a user of the component needs to know. In particular, the documentation must address any assumptions that might lead to panic if violated by the user.
Detailed documentation of how to avoid bugs in using a component is an indicator of excessive complexity and should be treated as a bug. Simplifying the usage will improve the robustness of the Agent.
Documentation should include:
Precise information about data ownership of passed values and returned values. Users can assume that any mutable value returned by a component will not be modified by the user or the component after it is returned. Similarly, any mutable value passed to a component will not be later modified, whether by the component or the caller. Any deviation from these defaults should be documented.
Note
It can be surprisingly hard to avoid mutating data -- for example, append(..)
surprisingly mutates its first argument. It is also hard to detect these bugs, as they are often intermittent, cause silent data corruption, or introduce rare data races. Where performance is not an issue, prefer to copy mutable input and outputs to avoid any potential bugs.
Precise information about goroutines and blocking. Users can assume that methods do not block indefinitely, so blocking methods should be documented as such. Methods that invoke callbacks should be clear about how the callback is invoked, and what it might do. For example, document whether the callback can block, and whether it might be called concurrently with other code.
You might need to express the fact that some of your dependencies are optional. This often happens for components that interact with many other components if available (that is, if they were included at compile time). This allows your component to interact with each other without forcing their inclusion in the current binary.
The optional.Option type answers this need.
For examples, consider the metadata components that are included in multiple binaries (core-agent
, DogStatsD
, etc.). These components use the sysprobeconfig
component if it is available. sysprobeconfig
is available in the core-agent
but not in DogStatsD
.
To do this in the metadata
component:
type Requires struct {\n SysprobeConf optional.Option[sysprobeconfig.Component]\n [...]\n}\n\nfunc NewMetadata(deps Requires) (metadata.Component) {\n if sysprobeConf, found := deps.SysprobeConf.Get(); found {\n // interact with sysprobeconfig\n }\n}\n
The above code produces a generic component, included in both core-agent
and DogStatsD
binaries, that can interact with sysprobeconfig
without forcing the binaries to compile with it.
You can use this pattern for every component, since all components provide Fx with a conversion function to convert their Component
interfaces to optional.Option[Component]
(see creating components).
The Agent uses Fx as its application framework. While the linked Fx documentation is thorough, it can be a bit difficult to get started with. This document describes how Fx is used within the Agent in a more approachable style.
"},{"location":"components/fx/#what-is-it","title":"What Is It?","text":"Fx's core functionality is to create instances of required types \"automatically,\" also known as dependency injection. Within the agent, these instances are components, so Fx connects components to one another. Fx creates a single instance of each component, on demand.
This means that each component declares a few things about itself to Fx, including the other components it depends on. An \"app\" then declares the components it contains to Fx, and instructs Fx to start up the whole assembly.
"},{"location":"components/fx/#providing-and-requiring","title":"Providing and Requiring","text":"Fx connects components using types. Within the Agent, these are typically interfaces named Component
. For example, scrubber.Component
might be an interface defining functionality for scrubbing passwords from data structures:
type Component interface {\n ScrubString(string) string\n}\n
Fx needs to know how to provide an instance of this type when needed, and there are a few ways:
fx.Provide(NewScrubber)
where NewScrubber
is a constructor that returns a scrubber.Component
. This indicates that if and when a scrubber.Component
is required, Fx should call NewScrubber
. It will call NewScrubber
only once, using the same value everywhere it is required.fx.Supply(scrubber)
where scrubber
implements the scrubber.Component
interface. When another component requires a scrubber.Component
, this is the instance it will get.The first form is much more common, as most components have constructors that do interesting things at runtime. A constructor can return multiple arguments, in which case the constructor is called if any of those argument types are required. Constructors can also return error
as the final return type. Fx will treat an error as fatal to app startup.
Fx also needs to know when an instance is required, and this is where the magic happens. In specific circumstances, it uses reflection to examine the argument list of functions, and creates instances of each argument's type. Those circumstances are:
fx.Provide
. Imagine NewScrubber
depends on the config module to configure secret matchers: func NewScrubber(config config.Component) Component {\n return &scrubber{\n matchers: makeMatchersFromConfig(config),\n }\n}\n
fx.Invoke
: fx.Invoke(func(sc scrubber.Component) {\n fmt.Printf(\"scrubbed: %s\", sc.ScrubString(somevalue))\n})\n
Like constructors, Invoked functions can take multiple arguments, and can optionally return an error. Invoked functions are called automatically when an app is created.Pointers passed to fx.Populate
.
var sc scrubber.Component\n// ...\nfx.Populate(&sc)\n
Populate is useful in tests to fill an existing variable with a provided value. It's equivalent to fx.Invoke(func(tmp scrubber.Component) { sc = tmp })
. Functions can take multiple arguments of different types, requiring all of them.
You may have noticed that all of the fx
methods defined so far return an fx.Option
. They don't actually do anything on their own. Instead, Fx uses the functional options pattern from Rob Pike. The idea is that a function takes a variable number of options, each of which has a different effect on the result.
In Fx's case, the function taking the options is fx.New
, which creates a new fx.App
. It's within the context of an app that requirements are met, constructors are called, and so on.
Tying the example above together, a very simple app might look like this:
someValue = \"my password is hunter2\"\napp := fx.New(\n fx.Provide(scrubber.NewScrubber),\n fx.Invoke(func(sc scrubber.Component) {\n fmt.Printf(\"scrubbed: %s\", sc.ScrubString(somevalue))\n }))\napp.Run()\n// Output: scrubbed: my password is *******\n
For anything more complex, it's not practical to call fx.Provide
for every component in a single source file. Fx has two abstraction mechanisms that allow combining lots of options into one app:
fx.Options
simply bundles several Option values into a single Option that can be placed in a variable. As the example in the Fx documentation shows, this is useful to gather the options related to a single Go package, which might include un-exported items, into a single value typically named Module
.fx.Module
is very similar, with two additional features. First, it requires a module name which is used in some Fx logging and can help with debugging. Second, it creates a scope for the effects of fx.Decorate
and fx.Replace
. The second feature is not used in the Agent.So a slightly more complex version of the example might be:
scrubber/component.go main.gofunc Module() fxutil.Module {\n return fx.Module(\"scrubber\",\n fx.Provide(newScrubber)) // now newScrubber need not be exported\n}\n
someValue = \"my password is hunter2\"\napp := fx.New(\n scrubber.Module(),\n fx.Invoke(func(sc scrubber.Component) {\n fmt.Printf(\"scrubbed: %s\", sc.ScrubString(somevalue))\n }))\napp.Run()\n// Output: scrubbed: my password is *******\n
"},{"location":"components/fx/#lifecycle","title":"Lifecycle","text":"Fx provides an fx.Lifecycle
component that allows hooking into application start-up and shut-down. Use it in your component's constructor like this:
func newScrubber(lc fx.Lifecycle) Component {\n sc := &scrubber{..}\n lc.Append(fx.Hook{OnStart: sc.start, OnStop: sc.stop})\n return sc\n}\n\nfunc (sc *scrubber) start(ctx context.Context) error { .. }\nfunc (sc *scrubber) stop(ctx context.Context) error { .. }\n
This separates the application's lifecycle into a few distinct phases:
Fx provides some convenience types to help build constructors that require or provide lots of types: fx.In
and fx.Out
. Both types are embedded in structs, which can then be used as argument and return types for constructors, respectively. By convention, these are named dependencies
and provides
in Agent code:
type dependencies struct {\n fx.In\n\n Config config.Component\n Log log.Component\n Status status.Component\n)\n\ntype provides struct {\n fx.Out\n\n Component\n // ... (we'll see why this is useful below)\n}\n\nfunc newScrubber(deps dependencies) (provides, error) { // can return an fx.Out struct and other types, such as error\n // ..\n return provides {\n Component: scrubber,\n // ..\n }, nil\n}\n
In and Out provide a nice way to summarize and document requirements and provided types, and also allow annotations via Go struct tags. Note that annotations are also possible with fx.Annotate
, but it is much less readable and its use is discouraged.
Value groups make it easier to produce and consume many values of the same type. A component can add any type into groups which can be consumed by other components.
For example:
Here, two components add a server.Endpoint
type to the server
group (note the group
label in the fx.Out
struct).
type provides struct {\n fx.Out\n Component\n Endpoint server.Endpoint `group:\"server\"`\n}\n
type provides struct {\n fx.Out\n Component\n Endpoint server.Endpoint `group:\"server\"`\n}\n
Here, a component requests all the types added to the server
group. This takes the form of a slice received at instantiation (note once again the group
label but in fx.In
struct).
type dependencies struct {\n fx.In\n Endpoints []Endpoint `group:\"server\"`\n}\n
"},{"location":"components/fx/#day-to-day-usage","title":"Day-to-Day Usage","text":"Day-to-day, the Agent's use of Fx is fairly formulaic. Following the component guidelines, or just copying from other components, should be enough to make things work without a deep understanding of Fx's functionality.
"},{"location":"components/migration/","title":"Integrating with other components","text":"After you create your component, you can link it to other components such as flares. (Others, like status pages or health, will come later).
This section documents how to fully integrate your component in the Agent ecosystem.
"},{"location":"components/overview/","title":"Overview of components","text":"The Agent is structured as a collection of components working together. Depending on how the binary is built, and how it is invoked, different components may be instantiated.
"},{"location":"components/overview/#what-is-a-component","title":"What is a component?","text":"The goal of a component is to encapsulate a particular piece of logic/feature and provide a clear and documented interface.
A component must:
Any change within a component that don't change its interface should not require QA of another component using it.
Since each component is an interface to the outside, it can have several implementations.
"},{"location":"components/overview/#fx-vs-go-module","title":"Fx vs Go module","text":"Components are designed to be used with a dependency injection framework. In the Agent, we use Fx, a dependency injection framework, for this. All Agent binaries use Fx to load, coordinate, and start the required components.
Some components are used outside the datadog-agent
repository, where Fx is not available. To support this, the components implementation must not require Fx. Component implementations can be exported as Go modules. The next section explains in more detail how to create components.
The important information here is that it's possible to use components without Fx outside the Agent repository. This comes at the cost of manually doing the work of Fx.
"},{"location":"components/overview/#important-note-on-fx","title":"Important note on Fx","text":"The component framework project's core goal is to improve the Agent codebase by decoupling parts of the code, removing global state and init functions, and increasing reusability by separating logical units into components. Fx itself is not intrinsic to the benefits of componentization.
"},{"location":"components/overview/#next","title":"Next","text":"Next, see how to create a bundle and a component by using Fx.
"},{"location":"components/testing/","title":"Testing components","text":"Testing is an essential part of the software development life cycle. This page covers everything you need to know about testing components.
One of the core benefits of using components is that each component isolates its internal logic behind its interface. Focus on asserting that each implementation behaves correctly.
To recap from the previous page, a component was created that compresses the payload before sending it to the Datadog backend. The component has two separate implementations.
This is the component's interface:
comp/compression/def/component.gotype Component interface {\n // Compress compresses the input data.\n Compress([]byte) ([]byte, error)\n\n // Decompress decompresses the input data.\n Decompress([]byte) ([]byte, error)\n}\n
Ensure the Compress
and Decompress
functions behave correctly.
Writing tests for a component implementation follows the same rules as any other test in a Go project. See the testing package documentation for more information.
For this example, write a test file for the zstd
implementation. Create a new file named component_test.go
in the impl-zstd folder
. Inside the test file, initialize the component's dependencies, create a new component instance, and test the behavior.
All components expect a Requires
struct with all the necessary dependencies. To ensure a component instance can be created, create a requires
instance.
The Requires
struct declares a dependency on the config component and the log component. The following code snippet shows how to create the Require
struct:
package implzstd\n\nimport (\n \"testing\"\n\n configmock \"github.com/DataDog/datadog-agent/comp/core/config/mock\"\n logmock \"github.com/DataDog/datadog-agent/comp/core/log/mock\"\n)\n\nfunc TestCompress(t *testing.T) {\n logComponent := configmock.New(t)\n configComponent := logmock.New(t)\n\n requires := Requires{\n Conf: configComponent,\n Log: logComponent,\n }\n // [...]\n}\n
To create the log and config component, use their respective mocks. The mock package was mentioned previously in the Creating a Component page.
"},{"location":"components/testing/#testing-the-components-interface","title":"Testing the component's interface","text":"Now that the Require
struct is created, an instance of the component can be created and its functionality tested:
package implzstd\n\nimport (\n \"testing\"\n\n configmock \"github.com/DataDog/datadog-agent/comp/core/config/mock\"\n logmock \"github.com/DataDog/datadog-agent/comp/core/log/mock\"\n)\n\nfunc TestCompress(t *testing.T) {\n logComponent := configmock.New(t)\n configComponent := logmock.New(t)\n\n requires := Requires{\n Conf: configComponent,\n Log: logComponent,\n }\n\n provides := NewComponent(requires)\n component := provides.Comp\n\n result, err := component.Compress([]byte(\"Hello World\"))\n assert.Nil(t, err)\n\n assert.Equal(t, ..., result)\n}\n
"},{"location":"components/testing/#testing-lifecycle-hooks","title":"Testing lifecycle hooks","text":"Sometimes a component uses Fx lifecycle to add hooks. It is a good practice to test the hooks as well.
For this example, imagine a component wants to add some hooks into the app lifecycle. Some code is omitted for simplicity:
comp/somecomponent/impl/component.gopackage impl\n\nimport (\n \"context\"\n\n somecomponent \"github.com/DataDog/datadog-agent/comp/somecomponent/def\"\n compdef \"github.com/DataDog/datadog-agent/comp/def\"\n)\n\ntype Requires struct {\n Lc compdef.Lifecycle\n}\n\ntype Provides struct {\n Comp somecomponent.Component\n}\n\ntype component struct {\n started bool\n stopped bool\n}\n\nfunc (c *component) start() error {\n // [...]\n\n c.started = true\n\n return nil\n}\n\nfunc (h *healthprobe) stop() error {\n // [...]\n\n c.stopped = true\n c.started = false\n\n return nil\n}\n\n// NewComponent creates a new healthprobe component\nfunc NewComponent(reqs Requires) (Provides, error) {\n provides := Provides{}\n comp := &component{}\n\n reqs.Lc.Append(compdef.Hook{\n OnStart: func(ctx context.Context) error {\n return comp.start()\n },\n OnStop: func(ctx context.Context) error {\n return comp.stop()\n },\n })\n\n provides.Comp = comp\n return provides, nil\n}\n
The goal is to test that the component updates the started
and stopped
fields.
To accomplish this, create a new lifecycle instance, create a Require
struct instance, initialize the component, and validate that calling Start
on the lifecycle instance calls the component hook and executes the logic.
To create a lifecycle instance, use the helper function compdef.NewTestLifecycle(t *testing.T)
. The function returns a lifecycle wrapper that can be used to populate the Requires
struct. The Start
and Stop
functions can also be called.
Info
You can see the NewTestLifecycle
function here
package impl\n\nimport (\n \"context\"\n \"testing\"\n\n compdef \"github.com/DataDog/datadog-agent/comp/def\"\n \"github.com/stretchr/testify/assert\"\n)\n\nfunc TestStartHook(t *testing.T) {\n lc := compdef.NewTestLifecycle(t)\n\n requires := Requires{\n Lc: lc,\n }\n\n provides, err := NewComponent(requires)\n\n assert.NoError(t, err)\n\n assert.NotNil(t, provides.Comp)\n internalComponent := provides.Comp.(*component)\n\n ctx := context.Background()\n lc.AssertHooksNumber(1)\n assert.NoError(t, lc.Start(ctx))\n\n assert.True(t, internalComponent.started)\n}\n
For this example, a type cast operation had to be performed because the started
field is private. Depending on the component, this may not be necessary.
Using components within other components is covered on the create components page.
Now let's explore how to use components in your binaries. One of the core idea behind component design is to be able to create new binaries for customers by aggregating components.
"},{"location":"components/using-components/#the-cmd-folder","title":"thecmd
folder","text":"All main
functions and binary entry points should be in the cmd
folder.
The cmd
folder uses the following hierarchy:
cmd /\n <binary name> /\n main.go <-- The entry points from your binary\n subcommands / <-- All subcommand for your binary CLI\n <subcommand name> / <-- The code specific to a single subcommand\n command.go\n command_test.go\n
Say you want to add a test
command to the agent
CLI.
You would create the following file:
cmd/agent/subcommands/test/command.gopackage test\n\nimport (\n// [...]\n)\n\n// Commands returns a slice of subcommands for the 'agent' command.\n//\n// The Agent uses \"cobra\" to create its CLI. The command method is your entrypoint. Here, you're going to create a single\n// command.\nfunc Commands(globalParams *command.GlobalParams) []*cobra.Command {\n cmd := &cobra.Command{\n Use: \"test\",\n Short: \"a test command for the Agent\",\n Long: ``,\n RunE: func(_ *cobra.Command, _ []string) error {\n return fxutil.OneShot(\n <callback>,\n <list of dependencies>.\n )\n },\n }\n\n return []*cobra.Command{cmd}\n}\n
The code above creates a test command that does nothing. As you can see, fxutil.OneShot
helpers are being used. These helpers initialize an Fx app with all the wanted dependencies.
The next section explains how to request a dependency.
"},{"location":"components/using-components/#importing-components","title":"Importing components","text":"The fxutil.OneShot
takes a list of components and gives them to Fx. Note that this only tells Fx how to create types when they're needed. This does not do anything else.
For a component to be instantiated, it must be one of the following:
callback
functionfx.Invoke
. More on this on the Fx page.Let's require the log
components:
import (\n // First let's import the FX wrapper to require it\n logfx \"github.com/DataDog/datadog-agent/comp/core/log/fx\"\n // Then the logger interface to use it\n log \"github.com/DataDog/datadog-agent/comp/core/log/def\"\n)\n\n// [...]\n return fxutil.OneShot(\n myTestCallback, // The function to call from fxutil.OneShot\n logfx.Module(), // This will tell FX how to create the `log.Component`\n )\n// [...]\n\nfunc myTestCallback(logger log.Component) {\n logger.Info(\"some message\")\n}\n
"},{"location":"components/using-components/#importing-bundles","title":"Importing bundles","text":"Now let's say you want to include the core bundle instead. The core bundle offers many basic features (logger, config, telemetry, flare, ...).
import (\n // We import the core bundle\n core \"github.com/DataDog/datadog-agent/comp/core\"\n\n // Then the interfaces we want to use\n config \"github.com/DataDog/datadog-agent/comp/core/config/def\"\n)\n\n// [...]\n return fxutil.OneShot(\n myTestCallback, // The function to call from fxutil.OneShot\n core.Bundle(), // This will tell FX how to create the all the components included in the bundle\n )\n// [...]\n\nfunc myTestCallback(conf config.Component) {\n api_key := conf.GetString(\"api_key\")\n\n // [...]\n}\n
It's very important to understand that since myTestCallback
only uses the config.Component
, not all components from the core
bundle are instantiated! The core.Bundle
instructs Fx how to create components, but only the ones required are created.
In our example, the config.Component
might have dozens of dependencies instantiated from the core bundle. Fx handles all of this.
As your migration to components is not finished, you might need to manually instruct Fx on how to use plain types.
You will need to use fx.Supply
for this. More details can be found here.
But here is a quick example:
import (\n logfx \"github.com/DataDog/datadog-agent/comp/core/log/fx\"\n log \"github.com/DataDog/datadog-agent/comp/core/log/def\"\n)\n\n// plain custom type\ntype custom struct {}\n\n// [...]\n return fxutil.OneShot(\n myTestCallback,\n logfx.Module(),\n\n // fx.Supply populates values into Fx. \n // Any time this is needed, Fx will use it.\n fx.Supply(custom{})\n )\n// [...]\n\n// Here our function uses component and non-component type, both provided by Fx.\nfunc myTestCallback(logger log.Component, c custom) {\n logger.Info(\"Custom type: %v\", c)\n}\n
Info
This means that components can depend on plain types too (as long as the main entry point populates Fx options with them).
"},{"location":"components/shared_features/flares/","title":"Flare","text":"The general idea is to register a callback within your component to be called each time a flare is created. This uses Fx groups under the hood, but helpers are there to abstract all the complexity.
Once the callback is created, you will have to migrate the code related to your component from pkg/flare
to your component.
To add data to a flare, you first need to register a callback, also known as a FlareBuilder
.
Within your component, create a method with the following signature: func (c *yourComp) fillFlare(fb flaretypes.FlareBuilder) error
.
This function is called every time the Agent generates a flare\u2014whether from the CLI, RemoteConfig, or from the running Agent. Your callback takes a FlareBuilder as parameter. This object provides all the helpers functions needed to add data to a flare (adding files, copying directories, scrubbing data, and so on).
Example:
import (\n yaml \"gopkg.in/yaml.v2\"\n\n flare \"github.com/DataDog/datadog-agent/comp/core/flare/def\"\n)\n\nfunc (c *myComponent) fillFlare(fb flare.FlareBuilder) error {\n // Creating a new file\n fb.AddFile( \n \"runtime_config_dump.yaml\",\n []byte(\"content of my file\"),\n ) //nolint:errcheck \n\n // Copying a file from the disk into the flare\n fb.CopyFile(\"/etc/datadog-agent/datadog.yaml\") //nolint:errcheck\n return nil\n}\n
Read the FlareBuilder package documentation for more information on the API.
Any errors returned by the FlareBuilder
methods are logged into a file shipped within the flare. This means, in most cases, you can ignore errors returned by the FlareBuilder
methods. In all cases, ship as much data as possible in a flare instead of stopping at the first error.
Returning an error from your callback does not stop the flare from being created or sent. Rather, the error is logged into the flare too.
While it's possible to register multiple callbacks from the same component, try to keep all the flare code in a single callback.
"},{"location":"components/shared_features/flares/#register-your-callback","title":"Register your callback","text":"Now you need to register your callback to be called each time a flare is created. To do this, your component constructor needs to provide a new Provider. Use NewProvider function for this.
Example:
import (\n flare \"github.com/DataDog/datadog-agent/comp/core/flare/def\"\n)\n\ntype Provides struct {\n // [...]\n\n // Declare that your component will return a flare provider\n FlareProvider flare.Provider\n}\n\nfunc newComponent(deps Requires) Provides {\n comp := &myComponent{}\n\n // [...]\n\n return Provides{\n // [...]\n\n // NewProvider wraps your callback so it can be used as a 'Provider'\n FlareProvider: flare.NewProvider(comp.fillFlare),\n }\n}\n
"},{"location":"components/shared_features/flares/#testing","title":"Testing","text":"The flare component offers a FlareBuilder mock to test your callback.
Example:
import (\n \"testing\"\n\n \"github.com/DataDog/datadog-agent/comp/core/flare/helpers\"\n)\n\nfunc TestFillFlare(t *testing.T) {\n myComp := newComponent(...)\n\n flareBuilderMock := helpers.NewFlareBuilderMock(t)\n\n myComp.fillFlare(flareBuilderMock)\n\n flareBuilderMock.AssertFileExists(\"datadog.yaml\")\n flareBuilderMock.AssertFileContent(\"some_file.txt\", \"my content\")\n // ...\n}\n
"},{"location":"components/shared_features/flares/#migrating-your-code","title":"Migrating your code","text":"Now comes the hard part: migrating the code from pkg/flare
related to your component to your new callback.
The good news is that the code in pkg/flare
already uses the FlareBuilder
interface. So you shouldn't need to rewrite any logic. Don't forget to migrate the tests too and expand them (most of the flare features are not tested).
Keep in mind that the goal is to delete pkg/flare
once the migration to components is done.
Components can register a status provider. When the status command is executed, we will populate the information displayed using all the status providers.
"},{"location":"components/shared_features/status/#status-providers","title":"Status Providers","text":"There are two types of status providers: - Header Providers: these providers are displayed at the top of the status output. This section is reserved for the most important information about the agent, such as agent version, hostname, host info, or metadata. - Regular Providers: these providers are rendered after all the header providers.
Each provider has the freedom to configure how they want to display their information for the three types of status output: JSON, Text, and HTML. This flexibility allows you to tailor the output to best suit your component's needs.
The JSON and Text outputs are displayed within the status CLI, while the HTML output is used for the Agent GUI.
To guarantee consistent output, we order the status providers internally. The ordering mechanism depends on the provider type: header providers are ordered by their index, in ascending order, while regular providers are ordered alphabetically by name.
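For illustration, the ordering rules above correspond to a sketch like the following, using the provider interfaces defined just below (this is illustrative only, not the actual status component code):
import \"sort\"\n\n// orderProviders is a hedged sketch of the ordering described above.\nfunc orderProviders(headerProviders []HeaderProvider, providers []Provider) {\n // Header providers: ascending by Index().\n sort.Slice(headerProviders, func(i, j int) bool {\n return headerProviders[i].Index() < headerProviders[j].Index()\n })\n // Regular providers: alphabetical by Name().\n sort.Slice(providers, func(i, j int) bool {\n return providers[i].Name() < providers[j].Name()\n })\n}\n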
"},{"location":"components/shared_features/status/#header-providers-interface","title":"Header Providers Interface","text":"type HeaderProvider interface {\n // Index is used to choose the order in which the header information is displayed.\n Index() int\n // When displaying the Text output the name is render as a header\n Name() string\n JSON(verbose bool, stats map[string]interface{}) error\n Text(verbose bool, buffer io.Writer) error\n HTML(verbose bool, buffer io.Writer) error\n}\n
"},{"location":"components/shared_features/status/#regular-providers-interface","title":"Regular Providers Interface","text":"// Provider interface\ntype Provider interface {\n // Name is used to sort the status providers alphabetically.\n Name() string\n // Section is used to group the status providers.\n // When displaying the Text output the section is render as a header\n Section() string\n JSON(verbose bool, stats map[string]interface{}) error\n Text(verbose bool, buffer io.Writer) error\n HTML(verbose bool, buffer io.Writer) error\n}\n
"},{"location":"components/shared_features/status/#adding-a-status-provider","title":"Adding a status provider","text":"To add a status provider to your component, you need to declare it in the return value of its NewComponent()
function.
The status component provides helper functions to create status providers: NewInformationProvider
and NewHeaderInformationProvider
.
Also, the status component has helper functions to render text and HTML output: RenderText
and RenderHTML.
The signature for both functions is:
(templateFS embed.FS, template string, buffer io.Writer, data any)\n
The embed.FS
variable points to the location of the different status templates. These templates must be inside the component files. The folder must be named status_templates
. There are no rules for the template names, but for consistency across the codebase, we suggest using \"<component>.tmpl\"
for the text template and \"<component>HTML.tmpl\"
for the HTML template.
Below is an example of adding a status provider to your component.
comp/compression/impl/compressor.gopackage impl\n\nimport (\n \"embed\"\n \"io\"\n\n compression \"github.com/DataDog/datadog-agent/comp/compression/def\"\n \"github.com/DataDog/datadog-agent/comp/status\"\n)\n\ntype Requires struct {\n}\n\ntype Provides struct {\n Comp compression.Component\n Status status.InformationProvider\n}\n\ntype compressor struct {\n}\n\n// NewComponent returns an implementation for the compression component\nfunc NewComponent(reqs Requires) Provides {\n comp := &compressor{}\n\n return Provides{\n Comp: comp,\n Status: status.NewInformationProvider(comp),\n }\n}\n\n//\n// Since we are using the compressor as a status provider, we need to implement the status interface on our component.\n// (The Compress and Decompress methods of the compression.Component interface are omitted for brevity.)\n//\n\n//go:embed status_templates\nvar templatesFS embed.FS\n\n// Name renders the name\nfunc (c *compressor) Name() string {\n return \"Compression\"\n}\n\n// Section renders the section\nfunc (c *compressor) Section() string {\n return \"Compression\"\n}\n\n// JSON populates the status map\nfunc (c *compressor) JSON(_ bool, stats map[string]interface{}) error {\n c.populateStatus(stats)\n\n return nil\n}\n\n// Text renders the text output\nfunc (c *compressor) Text(_ bool, buffer io.Writer) error {\n return status.RenderText(templatesFS, \"compressor.tmpl\", buffer, c.getStatusInfo())\n}\n\n// HTML renders the html output\nfunc (c *compressor) HTML(_ bool, buffer io.Writer) error {\n return status.RenderHTML(templatesFS, \"compressorHTML.tmpl\", buffer, c.getStatusInfo())\n}\n\nfunc (c *compressor) populateStatus(stats map[string]interface{}) {\n // Here we populate whatever information we want to display for our component\n stats[\"compressor\"] = ...\n}\n\nfunc (c *compressor) getStatusInfo() map[string]interface{} {\n stats := make(map[string]interface{})\n\n c.populateStatus(stats)\n\n return stats\n}\n
"},{"location":"components/shared_features/status/#testing","title":"Testing","text":"A critical part of your component development is ensuring that the status output is displayed as expected is. We highly encourage you to add tests to your components, giving you the confidence that your status output is accurate and reliable. For our example above, testing the status output is as easy as testing the result of calling JSON
, Text
and HTML
.
package impl\n\nimport (\n \"bytes\"\n \"testing\"\n\n \"github.com/stretchr/testify/assert\"\n)\n\nfunc TestText(t *testing.T) {\n requires := Requires{}\n\n provides := NewComponent(requires)\n component := provides.Comp\n buffer := new(bytes.Buffer)\n\n err := component.Text(false, buffer)\n assert.Nil(t, err)\n\n assert.Equal(t, ..., buffer.String())\n}\n\nfunc TestJSON(t *testing.T) {\n requires := Requires{}\n\n provides := NewComponent(requires)\n component := provides.Comp\n info := map[string]interface{}{}\n\n err := component.JSON(false, info)\n assert.Nil(t, err)\n\n assert.Equal(t, ..., info[\"compressor\"])\n}\n
To complete testing, we encourage adding the new status section output as part of the e2e tests. The CLI status e2e tests are in test/new-e2e/tests/agent-subcommands/status
folder.
First of all, thanks for contributing!
This document provides some basic guidelines for contributing to this repository. To propose improvements, feel free to submit a PR.
"},{"location":"guidelines/contributing/#submitting-issues","title":"Submitting issues","text":"Have you fixed a bug or written a new check and want to share it? Many thanks!
In order to ease/speed up our review, here are some items you can check/improve when submitting your PR:
Contributor ChecklistReviewer ChecklistHave a proper commit history (we advise you to rebase if needed) with clear commit messages.
Write tests for the code you wrote.
Preferably make sure that all tests pass locally.
Summarize your PR with an explanatory title and a message describing your changes, cross-referencing any related bugs/PRs.
Use Reno to create a release note.
Open your PR against the main
branch.
Provide adequate QA/testing plan information.
The added code comes with tests.
The CI is green, all tests are passing (required or not).
All applicable labels are set on the PR (see PR labels list).
If applicable, the config template has been updated.
Note
Adding GitHub labels is only possible for contributors with write access.
Your pull request must pass all CI tests before we will merge it. If you're seeing an error and don't think it's your fault, it may not be! Join us on Slack or send us an email, and together we'll get it sorted out.
"},{"location":"guidelines/contributing/#keep-it-small-focused","title":"Keep it small, focused","text":"Avoid changing too many things at once. For instance if you're fixing the NTP check and at the same time shipping a dogstatsd improvement, it makes reviewing harder and the time-to-release longer.
"},{"location":"guidelines/contributing/#commit-messages","title":"Commit Messages","text":"Please don't be this person: git commit -m \"Fixed stuff\"
. Take a moment to write meaningful commit messages.
The commit message should describe the reason for the change and give extra details that will allow someone later on to understand in 5 seconds the thing you've been working on for a day.
This includes editing the commit message generated by GitHub from:
Including new features\n\n* Fix linter\n* WIP\n* Add test for x86\n* Fix licenses\n* Cleanup headers\n
to:
Including new features\n\nThis feature does this and that. Some tests are excluded on x86 because of ...\n
If your commit is only shipping documentation changes or example files, and is a complete no-op for the test suite, please add [skip ci] in the commit message body to skip the build and give that slot to someone else who does need it.
"},{"location":"guidelines/contributing/#pull-request-workflow","title":"Pull request workflow","text":"The goals ordered by priority are:
main
branch, have a meaningful commit history that allows understanding (even years later) what each commit does, and why.You must open the PR when the code is reviewable or you must set the PR as draft if you want to share code before it's ready for actual reviews.
"},{"location":"guidelines/contributing/#before-the-first-pr-review","title":"Before the first PR review","text":"Before the first PR review, meaningful commits are best: logically-encapsulated commits help the reviews go quicker and make the job for the reviewer easier. Conflicts with main
can be resolved with a git rebase origin/main
and a force push if it makes future review(s) easier.
After the first review, to make follow-up reviews easier:
main
using git merge origin/main
main
","text":"Once reviews are complete, the merge to main
should be done with either:
main
clean (even though some context/details are lost in the squash). The commit message for this squash should always be edited to concisely describe the commit without extraneous \u201caddress review comments\u201d text.We use Reno
to create our CHANGELOG. Reno is a pretty simple tool.
Each PR should include a releasenotes
file created with reno
, unless the PR doesn't have any impact on the behavior of the Agent and therefore shouldn't be mentioned in the CHANGELOG (examples: repository documentation updates, changes in code comments). PRs that don't require a release note file will be labeled changelog/no-changelog
by maintainers.
To install reno: pip install reno
Ultra quick Reno
HOWTO:
$> reno new <topic-of-my-pr> --edit\n[...]\n# Remove unused sections and fill the relevant ones.\n# Reno will create a new file in releasenotes/notes.\n#\n# Each section from every release note are combined when the CHANGELOG.rst is\n# rendered. So the text needs to be worded so that it does not depend on any\n# information only available in another section. This may mean repeating some\n# details, but each section must be readable independently of the other.\n#\n# Each section note must be formatted as reStructuredText.\n[...]\n
Then just add and commit the new releasenote (located in releasenotes/notes/
) with your PR. If the change is on the trace-agent
(folders cmd/trace-agent
or pkg/trace
) please prefix the release note with \"APM :\" and the argument with \"apm-\"."},{"location":"guidelines/contributing/#reno-sections","title":"Reno sections","text":"
The main thing to keep in mind is that the CHANGELOG is written for the agent's users and not its developers.
features
: describe shortly what your feature does.
example:
features:\n - |\n Introducing the Datadog Process Agent for Windows.\n
enhancements
: describe enhancements here: new behaviors that are too small to be considered a new feature.
example:
enhancements:\n - |\n Windows: Add PDH data to flare.\n
issues
: describe known issues or limitations of the agent.
example:
issues:\n - |\n Kubernetes 1.3 & OpenShift 3.3 are currently not fully supported: docker\n and kubelet integrations work OK, but apiserver communication (event\n collection, `kube_service` tagging) is not implemented\n
upgrade
: List actions to take or limitations that could arise upon upgrading the Agent. Notes here must include steps that users can follow to 1. know if they're affected and 2. handle the change gracefully on their end.
example:
upgrade:\n - |\n If you run a Nomad agent older than 0.6.0, the `nomad_group`\n tag will be absent until you upgrade your orchestrator.\n
deprecations
: List deprecation notes here.
example:
deprecations:\n- |\n Changed the attribute name to enable log collection from YAML configuration\n file from \"log_enabled\" to \"logs_enabled\", \"log_enabled\" is still\n supported.\n
security
: List security fixes, issues, warning or related topics here.
example:
security:\n - |\n The /agent/check-config endpoint has been patched to enforce\n authentication of the caller via a bearer session token.\n
fixes
: List the fixes done in your PR here. Remember to be clear and give a minimum of context so people reading the CHANGELOG understand what the fix is about.
example:
fixes:\n - |\n Fix EC2 tags collection when multiple marketplaces are set.\n
other
: Add any other information you want in the CHANGELOG that doesn't fit in any other section. This section should rarely be used.
example:
other:\n - |\n Only enable the ``resources`` metadata collector on Linux by default, to match\n Agent 5's behavior.\n
For internal PRs (from people in the Datadog organization), there are a few extra labels that can be used:
community/help-wanted
: for community PRs where help is needed to finish it.community
: for community PRs.changelog/no-changelog
: for PRs that don't require a reno releasenote (useful for PRs only changing documentation or tests).qa/done
or qa/no-code-change
: used to skip the QA week:
qa/done
label is recommended in case of code changes and manual / automated QA done before merge.qa/no-code-change
is recommended if there's no code changes in the Agent binary code.Important
Use qa/no-code-change
if your PR only changes tests or a module/package that does not end up in the Agent build. All of the following do not require QA:
major_change
: to flag the PR as a major change impacting many/all teams working on the agent and will require deeper QA (example: when we change the Python version shipped in the agent).
need-change/operator
, need-change/helm
: indicate that the configuration needs to be modified in the operator / helm chart as well.k8s/<min-version>
: indicate the lowest Kubernetes version compatible with the PR's feature.backport/<branch-name>
: Add this label to automatically create a PR against the <branch-name>
branch with your backported changes. The backport PR creation is triggered:
If there is a conflict, the bot prompts you with a list of instructions to follow (example) to manually backport your PR.
Also called checks, all officially supported Agent integrations live in the integrations-core repo. Please look there to submit related issues, PRs, or review the latest changes. For new integrations, please open a pull request in the integrations-extras repo.
"},{"location":"guidelines/docs/","title":"Writing developer docs","text":"This site is built by MkDocs and uses the Material for MkDocs theme.
You can serve documentation locally with the docs.serve
invoke task.
The site structure is defined by the nav
key in the mkdocs.yml
file.
When adding new pages, first think about what it is exactly that you are trying to document. For example, if you intend to write about something everyone must follow as a standard practice it would be classified as a guideline whereas a short piece about performing a particular task would be a how-to.
After deciding the kind of content, strive to further segment the page under logical groupings for easier navigation.
"},{"location":"guidelines/docs/#line-continuations","title":"Line continuations","text":"For prose where the rendered content should have no line breaks, always keep the Markdown on the same line. This removes the need for any stylistic enforcement and allows for IDEs to intelligently wrap as usual.
Tip
When you wish to force a line continuation but stay within the block, indent by 2 spaces from the start of the text and end the block with a new line. For example, the following shows how you would achieve a multi-line ordered list item:
Markdown1. first line\n\n second line\n\n1. third line\n
Rendered first line
second line
third line
When you want to call something out, use admonitions rather than making large chunks of text bold or italicized. The latter is okay for small spans within sentences.
Here's an example:
Markdown
!!! info\n Lorem ipsum ...\n
Rendered
Info
Lorem ipsum ...
Always use inline links rather than reference links.
The only exception to that rule is links that many pages may need to reference. Such links may be added to this file that all pages are able to reference.
"},{"location":"guidelines/docs/#abbreviations","title":"Abbreviations","text":"Abbreviations like DSD may be added to this file which will make it so that a tooltip will be displayed on hover.
"},{"location":"guidelines/deprecated-components-documentation/defining-apps/","title":"Defining Apps and Binaries","text":""},{"location":"guidelines/deprecated-components-documentation/defining-apps/#binaries","title":"Binaries","text":"Each binary is defined as a main
package in the cmd/
directory, such as cmd/iot-agent
. This top-level package contains only a simple main
function (or often, one for Windows and one for *nix) which performs any platform-specific initialization and then creates and executes a Cobra command.
Consider carefully the tree of Go imports that begins with the main
package. While the Go linker does some removal of unused symbols, the safest means to ensure a particular package isn't occupying space in the resulting binary is to not include it.
A \"simple binary\" here is one that does not have subcommands.
The Cobra configuration for the binary is contained in the command
subpackage of the main package (cmd/<binary>/command
). The main
function calls this package to create the command, and then executes it:
func main() {\n if err := command.MakeCommand().Execute(); err != nil {\n os.Exit(-1)\n }\n}\n
The command.MakeCommand
function creates the *cobra.Command
for the binary, with a RunE
field that defines an app, as described below.
Many binaries have a collection of subcommands, along with some command-line flags defined at the binary level. For example, the agent
binary has subcommands like agent flare
or agent diagnose
and accepts global --cfgfile
and --no-color
arguments.
As with simple binaries, the top-level Cobra command is defined by a MakeCommand
function in cmd/<binary>/command
. This command
package should also define a GlobalParams
struct and a SubcommandFactory
type:
// GlobalParams contains the values of agent-global Cobra flags.\n//\n// A pointer to this type is passed to SubcommandFactory's, but its contents\n// are not valid until Cobra calls the subcommand's Run or RunE function.\ntype GlobalParams struct {\n // ConfFilePath holds the path to the folder containing the configuration\n // file, to allow overrides from the command line\n ConfFilePath string\n\n // ...\n}\n\n// SubcommandFactory is a callable that will return a slice of subcommands.\ntype SubcommandFactory func(globalParams *GlobalParams) []*cobra.Command\n
Each subcommand is implemented in a subpackage of cmd/<binary>/subcommands
, such as cmd/<binary>/subcommands/version
. Each such subpackage contains a command.go
defining a Commands
function that defines the subcommands for that package:
func Commands(globalParams *command.GlobalParams) []*cobra.Command {\n cmd := &cobra.Command { .. }\n return []*cobra.Command{cmd}\n}\n
While Commands
typically returns only one command, it may make sense to return multiple commands when the implementations share substantial amounts of code, such as starting, stopping and restarting a service.
The main
function supplies a slice of subcommand factories to command.MakeCommand
, which calls each one and adds the resulting subcommands to the root command.
subcommandFactories := []command.SubcommandFactory{\n frobnicate.Commands,\n ...,\n}\nif err := command.MakeCommand(subcommandFactories).Execute(); err != nil {\n os.Exit(-1)\n}\n
The GlobalParams
type supports Cobra arguments that are global to all subcommands. It is passed to each subcommand factory so that the defined RunE
callbacks can access these arguments. If the binary has no global command-line arguments, it's OK to omit this type.
func MakeCommand(subcommandFactories []SubcommandFactory) *cobra.Command {\n globalParams := GlobalParams{}\n\n cmd := &cobra.Command{ ... }\n cmd.PersistentFlags().StringVarP(\n &globalParams.ConfFilePath, \"cfgpath\", \"c\", \"\",\n \"path to directory containing datadog.yaml\")\n\n for _, sf := range subcommandFactories {\n subcommands := sf(&globalParams)\n for _, subcommand := range subcommands {\n cmd.AddCommand(subcommand)\n }\n }\n\n return cmd\n}\n
If the available subcommands depend on build flags, move the creation of the subcommand factories to the subcommands/<command>
package and create the slice there using source files with //go:build
directives. Your factory can return nil
if your command is not compatible with the current build flag. In all cases, the subcommands build logic should be constrained to its package. See cmd/agent/subcommands/jmx/command_nojmx.go
for an example.
Apps map directly to fx.App
instances, and as such they define a set of provided components and instantiate some of them.
The fx.App
is always created after Cobra has parsed the command-line, within a cobra.Command#RunE
function. This means that the components supplied to an app, and any BundleParams values, are specific to the invoked command or subcommand.
A one-shot app is one which performs some task and exits, such as agent status
. The pkg/util/fxutil.OneShot
helper function provides a convenient shorthand to run a function only after all components have started. Use it like this:
cmd := cobra.Command{\n Use: \"foo\", ...,\n RunE: func(cmd *cobra.Command, args []string) error {\n return fxutil.OneShot(run,\n fx.Supply(core.BundleParams{}),\n core.Bundle(),\n ..., // any other bundles needed for this app\n )\n },\n}\n\nfunc run(log log.Component) error {\n log.Debug(\"foo invoked!\")\n ...\n}\n
The run
function typically also needs some command-line values. To support this, create a (sub)command-specific cliParams
type containing the required values, and embedding a pointer to GlobalParams:
type cliParams struct {\n *command.GlobalParams\n useTLS bool\n args []string\n}\n
Populate this type within Commands
, supply it as an Fx value, and require that value in the run
function:
func Commands(globalParams *command.GlobalParams) []*cobra.Command {\n cliParams := &cliParams{\n GlobalParams: globalParams,\n }\n cmd := cobra.Command{\n Use: \"foo\", ...,\n RunE: func(cmd *cobra.Command, args []string) error {\n cliParams.args = args\n return fxutil.OneShot(run,\n fx.Supply(cliParams),\n fx.Supply(core.BundleParams{}),\n core.Bundle(),\n ..., // any other bundles needed for this app\n )\n },\n }\n cmd.PersistentFlags().BoolVarP(&cliParams.useTLS, \"usetls\", \"\", false, \"force TLS use\")\n\n return []*cobra.Command{cmd}\n}\n\nfunc run(cliParams *cliParams, log log.Component) error {\n if cliParams.Verbose {\n log.Info(\"executing foo\")\n }\n ...\n}\n
This example includes cli params drawn from GlobalParams (Verbose
), from subcommand-specific args (useTLS
), and from Cobra (args
).
A daemon app is one that runs \"forever\", such as agent run
. Use the fxutil.Run
helper function for this variety of app:
cmd := cobra.Command{\n Use: \"foo\", ...,\n RunE: func(cmd *cobra.Command, args []string) error {\n return fxutil.Run(\n fx.Supply(core.BundleParams{}),\n core.Bundle(),\n ..., // any other bundles needed for this app\n fx.Supply(foo.BundleParams{}),\n foo.Bundle(), // the bundle implementing this app\n )\n },\n}\n
"},{"location":"guidelines/deprecated-components-documentation/defining-bundles/","title":"Defining Component Bundles","text":"A bundle is defined in a dedicated package named comp/<bundleName>
. The package must have the following defined in bundle.go
:
// team: <teamname>
. This is used to generate CODEOWNERS information.BundleParams
-- the type of the bundle's parameters (see below). This item should have a formulaic doc string like // BundleParams defines the parameters for this bundle.
Bundle
-- an fx.Option
that can be included in an fx.App
to make this bundle's components available. To assist with debugging, use fxutil.Bundle(options...)
. Use fx.Invoke(func(componentpkg.Component) {})
to instantiate components automatically. This item should have a formulaic doc string like // Module defines the fx options for this component.
Typically, a bundle will automatically instantiate the top-level components that represent the bundle's purpose. For example, the trace-agent bundle comp/trace
might automatically instantiate comp/trace/agent
.
You can use the invoke task deva components.new-bundle comp/<bundleName>
to generate a pre-filled bundle.go
file for the given bundle.
Apps can provide some intialization-time parameters to bundles. These parameters are limited to two kinds:
Anything else is runtime configuration and should be handled vi comp/core/config
or another mechanism.
Bundle parameters must stored only Params
types for sub components. The reason is that each sub component must be usable without BundleParams
.
import \".../comp/<bundleName>/foo\"\nimport \".../comp/<bundleName>/bar\"\n// ...\n\n// BundleParams defines the parameters for this bundle.\ntype BundleParams struct {\n Foo foo.Params\n Bar bar.Params\n}\n\nvar Bundle = fxutil.Bundle(\n // You must tell Fx how to get foo.Params from BundleParams.\n fx.Provide(func(params BundleParams) foo.Params { return params.Foo }),\n foo.Module(),\n // You must tell Fx how to get bar.Params from BundleParams.\n fx.Provide(func(params BundleParams) bar.Params { return params.Bar }),\n bar.Module(),\n)\n
"},{"location":"guidelines/deprecated-components-documentation/defining-bundles/#testing","title":"Testing","text":"A bundle should have a test file, bundle_test.go
, to verify the documentation's claim about its dependencies. This simply uses fxutil.TestBundle
to check that all dependencies are satisfied when given the full set of required bundles.
func TestBundleDependencies(t *testing.T) {\n fxutil.TestBundle(t, Bundle)\n}\n
"},{"location":"guidelines/deprecated-components-documentation/purpose/","title":"Purpose of component guidelines","text":"This section describes the mechanics of implementing apps, components, and bundles.
The guidelines are quite prescriptive, with the intent of making all components \"look the same\". This reduces cognitive load when using components -- no need to remember one component's peculiarities. It also allows Agent-wide changes, where we make the same formulaic change to each component. If a situation arises that contradicts the guidelines, then we can update the guidelines (and change all affected components).
"},{"location":"guidelines/deprecated-components-documentation/registrations/","title":"Component Registrations","text":"Components generally need to talk to one another! In simple cases, that occurs by method calls. But in many cases, a single component needs to communicate with a number of other components that all share some characteristics. For example, the comp/core/health
component monitors the health of many other components, and comp/workloadmeta/scheduler
provides workload events to an arbitrary number of subscribers.
The convention in the Agent codebase is to use value groups to accomplish this. The collecting component requires a slice of some collected type, and the providing components provide values of that type. Consider an example case of an HTTP server component to which endpoints can be attached. The server is the collecting component, requiring a slice of type []*endpoint
, where *endpoint
is the collected type. Providing components provide values of type *endpoint
.
The convention is to \"wrap\" the collected type in a Registration
struct type which embeds fx.Out
and has tag group:\"pkgname\"
, where pkgname
is the short package name (Fx requires a group name, and this is as good as any). This helps providing components avoid the common mistake of omitting the tag. Because it is wrapped in an exported Registration
type, the collected type can be an unexported type, as in the example below.
The collecting component should define the registration type and a constructor for it:
comp/server/component.go// ...\n// Server endpoints are provided by other components, by providing a server.Registration\n// instance.\n// ...\npackage server\n\ntype endpoint struct { // (the collected type)\n ...\n}\n\ntype Registration struct {\n fx.Out\n\n Endpoint endpoint `group:\"server\"`\n}\n\n// NewRegistration creates a new Registration instance for the given endpoint.\nfunc NewRegistration(route string, handler func()) Registration { ... }\n
Its implementation then requires a slice of the collected type (endpoint
), again using group:\"server\"
:
// endpoint defines an endpoint on this server.\ntype endpoint struct { ... }\n\ntype dependencies struct {\n fx.In\n\n Registrations []endpoint `group:\"server\"`\n}\n\nfunc newServer(deps dependencies) Component {\n // ...\n for _, e := range deps.Registrations {\n if e.handler == nil {\n continue\n }\n // ...\n }\n // ...\n}\n
It's good practice to ignore zero values, as that allows providing components to skip the registration if desired.
Finally, the providing component (in this case, foo
) includes a registration in its output as an additional provided type, beyond its Component
type:
func newFoo(deps dependencies) (Component, server.Registration) {\n // ...\n return foo, server.NewRegistration(\"/things/foo\", foo.handler)\n}\n
This technique has some caveats to be aware of:
Component
type is required. This may lead to components being instantiated in unexpected circumstances.Subscriptions are a common form of registration, and have support in the pkg/util/subscriptions
package.
In defining subscriptions, the component that transmits messages is the collecting component, and the processes receiving components are the providing components. These are matched using the message type, which must be unique across the codebase, and should not be a built-in type like string
. Providing components provide a subscriptions.Receiver[coll.Message]
which has a Ch
channel from which to receive messages. Collecting components require a subscriptions.Transmitter[coll.Message]
which has a Notify
method to send messages.
// ...\n// To subscribe to these announcements, provide a subscriptions.Subscription[announcer.Announcement].\n// ...\npackage announcer\n
func newAnnouncer(tx subscriptions.Transmitter[Anouncement]) Component {\n return &announcer{announcementTx: tx} // (store the transmitter)\n}\n\n// ... later send messages with\nfunc (ann *announcer) announce(a announcement) {\n ann.annoucementTx.Notify(a)\n}\n
func newListener() (Component, subscriptions.Receiver[announcer.Announcement]) {\n rx := subscriptions.Receiver[Event]() // create a receiver\n return &listener{announcementRx: rx}, rx // capture the receiver _and_ return it\n}\n\n// ... later receive messages (usually in an actor's main loop)\nfunc (l *listener) run() {\n loop {\n select {\n case a := <- l.announcementRx.Ch:\n ...\n }\n }\n}\n
Any component receiving messages via a subscription will automatically be instantiated by Fx if it is delcared in the app, regardless of whether its Component type is required by some other component. The workaround for this is to return a zero-valued Receiver when the component does not actually wish to receive messages (such as when the component is disabled by user configuration).
If a receiving component does not subscribe (for example, if it is not started), it can return the zero value, subscriptions.Receiver[Event]{}
, from its constructor. If a component returns a non-nil subscriber, it must consume messages from the receiver or risk blocking the transmitter.
See the pkg/util/subscriptions
documentation for more details.
Component dependencies are automatically determined from the arguments to a component constructor. Most components have a few dependencies, and use a struct named dependencies
to represent them:
type dependencies struct {\n fx.In\n\n Lc fx.Lifecycle\n Params internal.BundleParams\n Config config.Module\n Log log.Module\n // ...\n}\n\nfunc newThing(deps dependencies) Component {\n t := &thing{\n log: deps.Log,\n ...\n }\n deps.Lc.Append(fx.Hook{OnStart: t.start})\n return t\n}\n
"},{"location":"guidelines/deprecated-components-documentation/using-components/#testing","title":"Testing","text":"Testing for a component should use fxtest
to create the component. This focuses testing on the API surface of the component against which other components will be built. Per-function unit tests are, of course, also great where appropriate!
Here's an example testing a component with a mocked dependency on other
:
func TestMyComponent(t *testing.T) {\n var comp Component\n var otherComp other.Component\n app := fxtest.New(t,\n Module, // use the real version of this component\n other.MockModule(), // use the mock version of other\n fx.Populate(&comp), // get the instance of this component\n fx.Populate(&otherComp), // get the (mock) instance of the other component\n )\n\n // start and, at completion of the test, stop the components\n defer app.RequireStart().RequireStop()\n\n // cast `otherComp` to its mock interface to call mock-specific methods on it\n otherComp.(other.Mock).SetSomeValue(10) // Arrange\n comp.DoTheThing() // Act\n require.Equal(t, 20, otherComp.(other.Mock).GetSomeResult()) // Assert\n}\n
If the component has a mock implementation, it is a good idea to test that mock implementation as well.
"},{"location":"hostname/hostname_force_config_as_canonical/","title":"Config-provided hostname starting withip-
or domu
","text":""},{"location":"hostname/hostname_force_config_as_canonical/#description-of-the-issue","title":"Description of the issue","text":"In v6 and v7 Agents, if hostname
is set in datadog.yaml
(or through the DD_HOSTNAME
env var) and its value starts with ip-
or domu
, the hostname is not used in-app as the canonical hostname, even if it is a valid hostname. More information about what a canonical hostname is can be found at How does Datadog determine the Agent hostname?.
To know if your Agents are affected, starting with v6.16.0 and v7.16.0, the Agent logs the following warning if it detects a situation where the config-provided hostname is a valid hostname but will not be accepted as the canonical hostname in-app: Hostname '<HOSTNAME>' defined in configuration are not used as the in-app hostname. For more information: https://dtdg.co/agent-hostname-force-config-as-canonical
If this warning is logged, you have the following options:
hostname
from datadog.yaml
(or the DD_HOSTNAME
env var) and restart the Agent; orip-
or domu
","text":"Starting with Agent v6.16.0 and v7.16.0, the Agent supports the config option hostname_force_config_as_canonical
(default: false
). When set to true
, a configuration-provided hostname starting with ip-
or domu
is accepted as the canonical hostname in-app:
The repository contains a few submodules. To add a new one and ensure it is tested, follow the following steps:
Create a directory for the module:
cd ~/my_path_to/datadog-agent && mkdir mymodule\n
Initialize a new Go module:
cd path/to/mymodule && go mod init\n
Create a dummy root package file doc.go
:
cat >doc.go <<EOL\n// Unless explicitly stated otherwise all files in this repository are licensed\n// under the Apache License Version 2.0.\n// This product includes software developed at Datadog (https://www.datadoghq.com/).\n// Copyright 2016-present Datadog, Inc.\npackage mymodule\nEOL\n
Udpate the modules.yml
file at the root of the repository with this content:
path/to/mymodule:\n independent: true\n should_tag: false\n test_targets:\n - .\n
independent
: Should it be importable as an independent module?should_tag
: Should the Agent pipeline tag it?test_targets
: Should go test
target specific subfolders?If you use your module in another module within datadog-agent
, add the require
and replace
directives in go.mod
.
From the other module root, install the dependency with go get
:
go get github.com/DataDog/datadog-agent/path/to/mymodule\n
Then add the replace directive in the go.mod
file: module github.com/DataDog/datadog-agent/myothermodule\ngo 1.18\n// Replace with local version\nreplace github.com/DataDog/datadog-agent/path/to/mymodule => ../path/to/mymodule\nrequire (\n github.com/DataDog/datadog-agent/path/to/mymodule v0.0.0-20230526143644-ed785d3a20d5\n)\n
Example PR: #17350 Welcome to the wonderful world of developing the Datadog Agent. Here we document how we do things, advanced debugging techniques, coding conventions & best practices, the internals of our testing infrastructure, and so much more.
If you are intrigued, continue reading. If not, continue all the same
"},{"location":"#getting-started","title":"Getting started","text":"First, you'll want to set up your development environment.
"},{"location":"#agent-development-guidelines","title":"Agent development guidelines","text":"To know more about the general design of the Agent and how to add code and feature read our section on Components.
"},{"location":"#navigation","title":"Navigation","text":"Desktop readers can use keyboard shortcuts to navigate.
Keys ActionTo build the agent on Windows, see datadog-agent-buildimages.
"},{"location":"setup/#linux-and-macos","title":"Linux and macOS","text":""},{"location":"setup/#python","title":"Python","text":"The Agent embeds a full-fledged CPython interpreter so it requires the development files to be available in the dev env. The Agent can embed Python 2 and/or Python 3, you will need development files for all versions you want to support.
If you're on OSX/macOS, installing Python 2.7 and/or 3.11 with Homebrew:
brew install python@2\nbrew install python@3.11\n
On Linux, depending on the distribution, you might need to explicitly install the development files, for example on Ubuntu:
sudo apt-get install python2.7-dev\nsudo apt-get install python3.11-dev\n
On Windows, install Python 2.7 and/or 3.11 via the official installer brings along all the development files needed:
Warning
If you don't use one of the Python versions that are explicitly supported, you may have problems running the built Agent's Python checks, especially if using a virtualenv. At this time, only Python 3.11 is confirmed to work as expected in the development environment.
"},{"location":"setup/#python-dependencies","title":"Python Dependencies","text":""},{"location":"setup/#preface","title":"Preface","text":"Invoke is a task runner written in Python that is extensively used in this project to orchestrate builds and test runs. To run the tasks, you need to have it installed on your machine. We offer two different ways to run our invoke tasks.
"},{"location":"setup/#deva-recommended","title":"deva
(recommended)","text":"The deva
CLI tool is a single binary that can be used to install and manage the development environment for the Agent, built by the Datadog team. It will install all the necessary Python dependencies for you. The development environment will be completely independent of your system Python installation. This tool leverages PyApp, a wrapper for Python applications that bootstrap themselves at runtime. In our case, we wrap invoke
itself and include the dependencies needed to work on the Agent.
To install deva
, you'll need to:
deva
in place of invoke
or inv
.The Python environment will automatically be created on the first run. and will be reused for subsequent runs. For example:
cd datadog-agent\ncurl -L -o deva https://github.com/DataDog/datadog-agent-devtools/releases/download/deva-v1.0.0/deva-aarch64-unknown-linux-gnu-1.0.0\nchmod +x deva\n./deva linter.go\n
Below a live demo of how the tool works:
If you want to uninstall deva
, you can simply run the ./deva self remove
command, which will remove the virtual environment from your system, and remove the binary. That's it.
To protect and isolate your system-wide python installation, a python virtual environment is highly recommended (though optional). It will help keep a self-contained development environment and ensure a clean system Python.
Note
Due to the way some virtual environments handle executable paths (e.g. python -m venv
), not all virtual environment options will be able to run the built Agent correctly. At this time, the only confirmed virtual environment creator that is known for sure to work is virtualenv
.
python3 -m pip install virtualenv\n
virtualenv $GOPATH/src/github.com/DataDog/datadog-agent/venv\n
If using virtual environments when running the built Agent, you may need to override the built Agent's search path for Python check packages using the PYTHONPATH
variable (your target path must have the pre-requisite core integration packages installed though).
PYTHONPATH=\"./venv/lib/python3.11/site-packages:$PYTHONPATH\" ./agent run ...\n
See also some notes in ./checks about running custom python checks.
"},{"location":"setup/#install-invoke-and-its-dependencies","title":"Install Invoke and its dependencies","text":"Our invoke tasks are only compatible with Python 3, thus you will need to use Python 3 to run them.
Though you may install invoke in a variety of way we suggest you use the provided requirements file and pip
:
pip install -r tasks/requirements.txt\n
This procedure ensures you not only get the correct version of invoke
, but also any additional python dependencies our development workflow may require, at their expected versions. It will also pull other handy development tools/deps (reno
, or docker
).
You must install Golang version 1.23.3
or later. Make sure that $GOPATH/bin
is in your $PATH
otherwise invoke
cannot use any additional tool it might need.
Note
Versions of Golang that aren't an exact match to the version specified in our build images (see e.g. here) may not be able to build the agent and/or the rtloader binary properly.
"},{"location":"setup/#installing-tooling","title":"Installing tooling","text":"From the root of datadog-agent
, run invoke install-tools
to install go tooling. This uses go
to install the necessary dependencies.
When working on the Agent codebase you can choose among two different ways to build the binary, informally named System and Embedded builds. For most contribution scenarios you should rely on the System build (the default) and use the Embedded one only for specific use cases. Let's explore the differences.
"},{"location":"setup/#system-build","title":"System build","text":"System builds use your operating system's standard system libraries to satisfy the Agent's external dependencies. Since, for example, macOS 10.11 may provide a different version of Python than macOS 10.12, system builds on each of these platforms may produce different Agent binaries. If this doesn't matter to you\u2014perhaps you just want to contribute a quick bugfix\u2014do a System build; it's easier and faster than an Embedded build. System build is the default for all build and test tasks, so you don't need to configure anything there. But to make sure you have system copies of all the Agent's dependencies, skip the Embedded build section below and read on to see how to install them via your usual package manager (apt, yum, brew, etc).
"},{"location":"setup/#embedded-build","title":"Embedded build","text":"Embedded builds download specifically-versioned dependencies and compile them locally from sources. We run Embedded builds to create Datadog's official Agent releases (i.e. RPMs, debs, etc), and while you can run the same builds while developing locally, the process is as slow as it sounds. Hence, you should only use them when you care about reproducible builds. For example:
Embedded builds rely on Omnibus to download and build dependencies, so you need a recent ruby
environment with bundler
installed. See how to build Agent packages with Omnibus for more details.
The agent is able to collect systemd journal logs using a wrapper on the systemd utility library.
On Ubuntu/Debian:
sudo apt-get install libsystemd-dev\n
On Redhat/CentOS:
sudo yum install systemd-devel\n
"},{"location":"setup/#docker","title":"Docker","text":"If you want to build a Docker image containing the Agent, or if you wan to run system and integration tests you need to run a recent version of Docker in your dev environment.
"},{"location":"setup/#doxygen","title":"Doxygen","text":"We use Doxygen to generate the documentation for the rtloader
part of the Agent.
To generate it (using the invoke rtloader.generate-doc
command), you'll need to have Doxygen installed on your system and available in your $PATH
. You can compile and install Doxygen from source with the instructions available here. Alternatively, you can use already-compiled Doxygen binaries from here.
To get the dependency graphs, you may also need to install the dot
executable from graphviz and add it to your $PATH
.
It is optional but recommended to install pre-commit
to run a number of checks done by the CI locally.
To install it, run:
python3 -m pip install pre-commit\npre-commit install\n
The shellcheck
pre-commit hook requires having the shellcheck
binary installed and in your $PATH
. To install it, run:
deva install-shellcheck --destination <path>\n
(by default, the shellcheck binary is installed in /usr/local/bin
).
pre-commit
","text":"If you want to skip pre-commit
for a specific commit you can add --no-verify
to the git commit
command.
pre-commit
manually","text":"If you want to run one of the checks manually, you can run pre-commit run <check name>
.
You can run it on all files with the --all-files
flag.
pre-commit run flake8 --all-files # run flake8 on all files\n
See pre-commit run --help
for further options.
Microsoft Visual Studio Code with the devcontainer plugin allow to use a container as remote development environment in vscode. It simplify and isolate the dependencies needed to develop in this repository.
To configure the vscode editor to use a container as remote development environment you need to:
deva vscode.setup-devcontainer --image \"<image name>\"
. This command will create the devcontainer configuration file ./devcontainer/devcontainer.json
.Microsoft Visual Studio Code is recommended as it's lightweight and versatile.
Building on Windows requires multiple 3rd-party software to be installed. To avoid the complexity, Datadog recommends to make the code change in VS Code, and then do the build in Docker image. For complete information, see Build the Agent packages
"},{"location":"architecture/dogstatsd/internals/","title":"DogStatsD internals","text":"(click to enlarge)
Information on DogStatsD, configuration and troubleshooting is available in the Datadog documentation.
"},{"location":"architecture/dogstatsd/internals/#packet","title":"Packet","text":"In DogStatsD, a Packet is a bytes array containing one or multiple metrics in the DogStatsD format (separated by a \\n
when there are several). Its maximum size is dogstatsd_buffer_size
.
\\n
The PacketAssembler gathers multiple datagrams into one Packet of maximum size, dogstatsd_buffer_size
, and sends it to the PacketsBuffer which avoids running the whole parsing pipeline with only one metric per packet. The bytes buffer used comes from the PacketPool, which avoids re-allocating the bytes buffer every time.
Note
The UDS pipeline does not use the PacketAssembler because each UDS packet also contains metadata (origin tags) which are used to enrich the metrics tags, making them impossible to be packed together by the PacketAssembler.
The PacketAssembler does not allocate a bytes array every time it has to use one. It retrieves one from a pool containing pre-allocated arrays and this pool never empties. The PacketAssembler allocates a new bytes array when it\u2019s needed. Once fully assembled by the PacketAssembler, the bytes array is sent through the rest of the DogStatsD pipeline and ownership is allocated to each part using it (PacketsBuffer, Worker). Eventually, the Worker takes care of returning it to the pool when the part has processed its content.
"},{"location":"architecture/dogstatsd/internals/#packetbuffer","title":"PacketBuffer","text":"\\n
)The PacketsBuffer buffers multiple Packets (in a slice), this way the parsing part of the pipeline is going through several Packets in a row instead of only one each time it is called. This leads to less CPU usage. PacketsBuffer sends the Packets for processing when either:
a. The buffer is full (contains dogstatsd_packet_buffer_size, default value: 32
)
or
b. A timer is triggered (i.e. dogstatsd_packer_buffer_flush_timeout, default value: 100ms
)
The PacketBuffer sends it in a Go buffered channel to the worker / parser, meaning that the channels can buffer the Packets on their own while waiting for the worker to read and process them.
In theory, the max memory usage of this Go buffered channel is:
dogstatsd_packer_buffer_size
* dogstatsd_buffer_size
* dogstatsd_queue_size
To this we can add per-listener buffers: dogstatsd_packer_buffer_size
* dogstatsd_buffer_size
* connections
. connections
will be 1 for uds
and udp
and one per client for uds-stream
.
The Worker is the part of the DogStatsD server responsible for parsing the metrics in the bytes array and turning them into MetricSamples.
The server spawns multiple workers based on the amount of cores available on the host:
(number of cores - 2)
workers. If this result is less than 2, the server spawns 2 workers.(number of cores / 2)
workers. If this result is less than 2, the server spawns 2 workers.The Worker is using a system called StringInterner to not allocate memory every time a string is needed. Note that this StringInterner is caching a finite number of strings and when it is full it is emptied to start caching strings again. Its size is configurable with dogstatsd_string_interner_size
.
The MetricSamples created are not directly sent to the Agent Demultiplexer but first to a part called the Batcher.
"},{"location":"architecture/dogstatsd/internals/#batcher","title":"Batcher","text":"The role of the Batcher is to accumulate multiple MetricSamples before sending them to the Agent Demultiplexer. Every time it accumulates 32 MetricSamples, the Batcher sends them to the Demultiplexer. The Batcher sends 32 MetricSamples in a channel buffering 100 sets. There is one channel per TimeSampler.
The size of a MetricSample depends on the size of the host's hostname, its metric name, and its number of tags. An example MetricSample with a 20 character hostname, 40 character metric name, and 200 characters of tags has a size of approximately 264 bytes. A Batcher can use a maximum of 844kb of memory.
"},{"location":"architecture/dogstatsd/internals/#timesamplerworker","title":"TimeSamplerWorker","text":"The TimeSamplerWorker runs in an infinite loop. It is responsible for the following:
The following calculations determine the number of TimeSamplerWorker and TimeSampler instances:
dogstatsd_pipeline_autoadjust
is true
then the workers count will be automatically adjusted.dogstatsd_pipeline_count
has a value, the number of TimeSampler pipelines equals that value.dogstatsd_pipeline_autoadjust_strategy
can be set to the following values:
max_throughput
: The number of TimeSampler pipelines is adjusted to maximize throughput. There are (number of core/2) - 1
instances of TimeSampler.per_origin
: The number of TimeSampler pipelines is adjusted to improve data locality. The number of dsdWorker instances is equal to half the number of cores. and the number of TimeSampler pipelines is equal dogstatsd_pipeline_count
or twice the number of cores. This strategy will provide a better compression ratio in shared environments and improve resource allocation fairness within the agent.The NoAggregationStreamWorker runs an infinite loop in a goroutine. It receives metric samples with timestamps, and it batches them to be sent as quickly as possible to the intake. It performs no aggregation nor extra processing, except from adding tags to the metrics.
It runs only when dogstatsd_no_aggregation_pipeline
is set to true
.
The payload being sent to the intake (through the normal Serializer
/Forwarder
pieces) contains, at maximum, dogstatsd_no_aggregation_pipeline_batch_size
metrics. This value defaults to 2048
.
Fx groups help you produce and group together values of the same type, even if these values are produced in different parts of the codebase. A component can add any type into a group; this group can then consumed by other components.
In the following example, a component add a server.Endpoint
type to the server
group.
type Provides struct {\n comp Component\n Endpoint server.Endpoint `group:\"server\"`\n}\n
In the following example, a component requests all the types added to the server
group. This takes the form of a slice received at instantiation.
type Requires struct {\n Endpoints []Endpoint `group:\"server\"`\n}\n
"},{"location":"components/creating-bundles/","title":"Creating a bundle","text":"A bundle is a grouping of related components. The goal of a bundle is to ease the usage of multiple components working together to constitute a product.
One example is DogStatsD
, a server to receive metrics locally from customer apps. DogStatsD
is composed of 9+ components, but at the binary level we want to include DogStatsD
as a whole.
For use cases like that of DogStatsD, create a bundle.
"},{"location":"components/creating-bundles/#creating-a-bundle_1","title":"Creating a bundle","text":"A bundle eases the aggregation of multiple components and lives in comp/<bundlesName>/
.
// Package <bundleName> ...\npackage <bundleName>\n\nimport (\n \"github.com/DataDog/datadog-agent/pkg/util/fxutil\"\n\n // We import all the components that we want to aggregate. A bundle must only aggregate components within its\n // sub-folders.\n comp1fx \"github.com/DataDog/datadog-agent/comp/<bundleName>/comp1/fx\"\n comp2fx \"github.com/DataDog/datadog-agent/comp/<bundleName>/comp2/fx\"\n comp3fx \"github.com/DataDog/datadog-agent/comp/<bundleName>/comp3/fx\"\n comp4fx \"github.com/DataDog/datadog-agent/comp/<bundleName>/comp4/fx\"\n)\n\n// A single team must own the bundle, even if they don't own all the sub-components\n// team: <the team owning the bundle>\n\n// Bundle defines the fx options for this bundle.\nfunc Bundle() fxutil.BundleOptions {\n return fxutil.Bundle(\n comp1fx.Module(),\n comp2fx.Module(),\n comp3fx.Module(),\n comp4fx.Module(),\n}\n
A bundle doesn't need to import all sub-components. The idea is to offer a default, easy-to-use grouping of components. But nothing prevents users from cherry-picking the components they want to use, as sketched below.
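For illustration, a sketch of such cherry-picking (assuming the fxutil.OneShot helper described later in this documentation): a binary that only needs two of the bundle's components can import their fx modules directly instead of the whole bundle.
// Hypothetical binary entry point: only comp1 and comp3 are wired in, not the whole bundle.\nreturn fxutil.OneShot(\n run,\n comp1fx.Module(),\n comp3fx.Module(),\n)\n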
"},{"location":"components/creating-components/","title":"Creating a Component","text":"This page explains how to create components in detail.
This page uses the example of creating a compression component. This component compresses a payload before sending it to the Datadog backend.
Since there are multiple ways to compress data, this component provides two implementations of the same interface:
A component contains multiple folders and Go packages. Developers split a component into packages to isolate the interface from the implementations and improve code sharing. Declaring the interface in a separate package from the implementation allows you to import the interface without importing all of the implementations.
"},{"location":"components/creating-components/#file-hierarchy","title":"File hierarchy","text":"All components are located in the comp
folder at the top of the Agent repo.
The file hierarchy is as follows:
comp /\n <bundle name> / <-- Optional\n <comp name> /\n def / <-- The folder containing the component interface and ALL its public types.\n impl / <-- The only or primary implementation of the component.\n impl-<alternate> / <-- An alternate implementation.\n impl-none / <-- Optional. A noop implementation.\n fx / <-- All fx related logic for the primary implementation, if any.\n fx-<alternate> / <-- All fx related logic for a specific implementation.\n mock / <-- The mock implementation of the component to ease testing.\n
To note:
impl
folder.impl-<version>
folders instead of an impl
folder. For example, your compression component has impl-zstd
and impl-zip
folders, but not an impl
folder.impl-none
folder.
This file hierarchy aims to solve a few problems:
def
folders and never care about which implementation was loaded in the main function.fx
folder per implementation, to allow binaries to import/link against a single folder.You can use the invoke task deva components.new-component comp/<COMPONENT_NAME>
to generate a scaffold for your new component.
Every public variable, function, struct, and interface of your component must be documented. Refer to the Documentation section below for details.
"},{"location":"components/creating-components/#the-def-folder","title":"The def folder","text":"The def
folder contains your interface and ALL public types needed by the users of your component.
In the example of a compression component, the def folder looks like this:
comp/compression/def/component.go// Package compression contains all public type and interfaces for the compression component\npackage compression\n\n// team: <your team>\n\n// Component describes the interface implemented by all compression implementations.\ntype Component interface {\n // Compress compresses the input data.\n Compress([]byte) ([]byte, error)\n\n // Decompress decompresses the input data.\n Decompress([]byte) ([]byte, error)\n}\n
All component interfaces must be called Component
, so all imports have the form <COMPONENT_NAME>.Component
.
You can see that the interface only exposes the bare minimum. You should aim at having the smallest possible interface for your component.
When defining a component interface, avoid using structs or interfaces from third-party dependencies.
Interface using a third-party dependency
package telemetry\n\nimport \"github.com/prometheus/client_golang/prometheus\"\n\n// team: agent-shared-components\n\n// Component is the component type.\ntype Component interface {\n // RegisterCollector Registers a Collector with the prometheus registry\n RegisterCollector(c prometheus.Collector)\n}\n
In the example above, every user of the telemetry
component would have to import github.com/prometheus/client_golang/prometheus
no matter which implementation they use.
In general, be mindful of using external types in the public interface of your component. For example, it would make sense to use Docker types in a docker
component, but not in a container
component.
The impl
folder is where the component implementation is written. The details of component implementation are up to the developer. The only requirements are that the package name follows the pattern <COMPONENT_NAME>impl
and that there is a public instantiation function called NewComponent
.
package compressionimpl\n\n// NewComponent returns a new ZSTD implementation for the compression component\nfunc NewComponent(reqs Requires) Provides {\n ....\n}\n
To require input arguments to the NewComponent
instantiation function, use a special struct named Requires
. The instantiation function returns a special stuct named Provides
. This internal nomenclature is used to handle the different component dependencies using Fx groups.
In this example, the compression component must access the configuration component and the log component. To express this, define a Requires
struct with two fields. The name of the fields is irrelevant, but the type must be the concrete type of interface that you require.
package compressionimpl\n\nimport (\n \"fmt\"\n\n config \"github.com/DataDog/datadog-agent/comp/core/config/def\"\n log \"github.com/DataDog/datadog-agent/comp/core/log/def\"\n)\n\n// Here, list all components and other types known by Fx that you need.\n// To be used in `fx` folders, type and field need to be public.\n//\n// In this example, you need config and log components.\ntype Requires struct {\n Conf config.Component\n Log log.Component\n}\n
Using other components
If you want to use another component within your own, add it to the Requires
struct, and Fx
will give it to you at initialization. Be careful of circular dependencies.
For the output of the component, populate the Provides
struct with the return values.
package compressionimpl\n\nimport (\n // Always import the component def folder, so that you can return a 'compression.Component' type.\n compression \"github.com/DataDog/datadog-agent/comp/compression/def\"\n)\n\n// Here, list all the types your component is going to return. You can return as many types as you want; all of them are available through Fx in other components.\n// To be used in `fx` folders, type and field need to be public.\n//\n// In this example, only the compression component is returned.\ntype Provides struct {\n Comp compression.Component\n}\n
All together, the component code looks like the following:
comp/compression/impl-zstd/compressor.gopackage compressionimpl\n\nimport (\n \"fmt\"\n\n compression \"github.com/DataDog/datadog-agent/comp/compression/def\"\n config \"github.com/DataDog/datadog-agent/comp/core/config/def\"\n log \"github.com/DataDog/datadog-agent/comp/core/log/def\"\n)\n\ntype Requires struct {\n Conf config.Component\n Log log.Component\n}\n\ntype Provides struct {\n Comp compression.Component\n}\n\n// The actual type implementing the 'Component' interface. This type MUST be private, you need the guarantee that\n// components can only be used through their respective interfaces.\ntype compressor struct {\n // Keep a ref on the config and log components, so that you can use them in the 'compressor' methods\n conf config.Component\n log log.Component\n\n // any other field you might need\n}\n\n// NewComponent returns a new ZSTD implementation for the compression component\nfunc NewComponent(reqs Requires) Provides {\n // Here, do whatever is needed to build a ZSTD compression comp.\n\n // And create your component\n comp := &compressor{\n conf: reqs.Conf,\n log: reqs.Log,\n }\n\n return Provides{\n comp: comp,\n }\n}\n\n//\n// You then need to implement all methods from your 'compression.Component' interface\n//\n\n// Compress compresses the input data using ZSTD\nfunc (c *compressor) Compress(data []byte) ([]byte, error) {\n c.log.Debug(\"compressing a buffer with ZSTD\")\n\n // [...]\n return compressData, nil\n}\n\n// Decompress decompresses the input data using ZSTD.\nfunc (c *compressor) Decompress(data []byte) ([]byte, error) {\n c.log.Debug(\"decompressing a buffer with ZSTD\")\n\n // [...]\n return compressData, nil\n}\n
The constructor can return either a Provides
, if it is infallible, or (Provides, error)
, if it could fail. In the latter case, a non-nil error results in the Agent crashing at startup with a message containing the error.
Each implementation follows the same pattern.
"},{"location":"components/creating-components/#the-fx-folders","title":"The fx folders","text":"The fx
folder must be the only folder importing and referencing Fx. It's meant to be a simple wrapper. Its only goal is to allow dependency injection with Fx for your component.
All fx.go
files must define a func Module() fxutil.Module
function. The helpers contained in fxutil
handle all the logic. Most fx/fx.go
file should look the same as this:
package fx\n\nimport (\n \"github.com/DataDog/datadog-agent/pkg/util/fxutil\"\n\n // You must import the implementation you are exposing through FX\n compressionimpl \"github.com/DataDog/datadog-agent/comp/compression/impl-zstd\"\n)\n\n// Module specifies the compression module.\nfunc Module() fxutil.Module {\n return fxutil.Component(\n // ProvideComponentConstructor will automatically detect the 'Requires' and 'Provides' structs\n // of your constructor function and map them to FX.\n fxutil.ProvideComponentConstructor(\n compressionimpl.NewComponent,\n )\n )\n}\n
Optional dependencies
To create an optional wrapper type for your component, you can use the helper function fxutil.ProvideOptional
. This generic function requires the type of the component interface, and will automatically make a conversion function optional.Option
for that component.
More on this in the FAQ.
For the ZIP implementation, create the same file in fx-zip
folder. In most cases, your component has a single implementation. If so, you have only one impl
and fx
folder.
fx-none
","text":"Some parts of the codebase might have optional dependencies on your components (see FAQ).
If it's the case, you need to provide a fx wrapper called fx-none
to avoid duplicating the use of optional.NewNoneOption[def.Component]()
in all our binaries
import (\n compression \"github.com/DataDog/datadog-agent/comp/compression/def\"\n)\n\nfunc Module() fxutil.Module {\n return fxutil.Component(\n fx.Provide(func() optional.Option[compression.Component] {\n return optional.NewNoneOption[compression.Component]()\n }))\n}\n
"},{"location":"components/creating-components/#the-mock-folder","title":"The mock folder","text":"To support testing, components MUST provide a mock implementation (unless your component has no public method in its interface).
Your mock must implement the Component
interface of the def
folder but can expose more methods if needed. All mock constructors must take a *testing.T
as parameter.
In the following example, your mock has no dependencies and returns the same string every time.
comp/compression/mock/mock.go//go:build test\n\npackage mock\n\nimport (\n \"testing\"\n\n compression \"github.com/DataDog/datadog-agent/comp/compression/def\"\n)\n\ntype Provides struct {\n Comp compression.Component\n}\n\ntype mock struct {}\n\n// New returns a mock compressor\nfunc New(*testing.T) Provides {\n return Provides{\n comp: &mock{},\n }\n}\n\n// Compress compresses the input data using ZSTD\nfunc (c *mock) Compress(data []byte) ([]byte, error) {\n return []byte(\"compressed\"), nil\n}\n\n// Decompress decompresses the input data using ZSTD.\nfunc (c *compressor) Decompress(data []byte) ([]byte, error) {\n return []byte(\"decompressed\"), nil\n}\n
"},{"location":"components/creating-components/#go-module","title":"Go module","text":"Go modules are not mandatory, but if you want to allow your component to be used outside the datadog-agent
repository, create Go modules in the following places:
impl
/impl-*
folder that you want to expose (you can only expose some implementations).def
folder to expose the interfacemock
folder to expose the mockNever add a Go module to the component folder (for example,comp/compression
) or any fx
folders.
In the end, a classic component folder should look like:
comp/<COMPONENT_NAME>/\n\u251c\u2500\u2500 def\n\u2502 \u2514\u2500\u2500 component.go\n\u251c\u2500\u2500 fx\n\u2502 \u2514\u2500\u2500 fx.go\n\u251c\u2500\u2500 impl\n\u2502 \u2514\u2500\u2500 component.go\n\u2514\u2500\u2500 mock\n \u2514\u2500\u2500 mock.go\n\n4 directories, 4 files\n
The example compression component, which has two implementations, looks like:
comp/core/compression/\n\u251c\u2500\u2500 def\n\u2502 \u2514\u2500\u2500 component.go\n\u251c\u2500\u2500 fx-zip\n\u2502 \u2514\u2500\u2500 fx.go\n\u251c\u2500\u2500 fx-zstd\n\u2502 \u2514\u2500\u2500 fx.go\n\u251c\u2500\u2500 impl-zip\n\u2502 \u2514\u2500\u2500 component.go\n\u251c\u2500\u2500 impl-zstd\n\u2502 \u2514\u2500\u2500 component.go\n\u2514\u2500\u2500 mock\n \u2514\u2500\u2500 mock.go\n\n6 directories, 6 files\n
This can seem like a lot for a single compression component, but this design answers the exponentially increasing complexity of the Agent ecosystem. Your component needs to behave correctly with many binaries composed of unique and shared components, outside repositories that want to pull only specific features, and everything in between.
Important
No components know how or where they will be used and MUST, therefore, respect all the rules above. It's a very common pattern for teams to work only on their use cases, thinking their code will not be used anywhere else. But customers want common behavior between all Datadog products (Agent, serverless, Agentless, Helm, Operator, etc.).
A key idea behind the component is to produce shareable and reusable code.
"},{"location":"components/creating-components/#general-consideration-about-designing-components","title":"General consideration about designing components","text":"Your component must:
The documentation (both package-level and method-level) should include everything a user of the component needs to know. In particular, the documentation must address any assumptions that might lead to panic if violated by the user.
Detailed documentation of how to avoid bugs in using a component is an indicator of excessive complexity and should be treated as a bug. Simplifying the usage will improve the robustness of the Agent.
Documentation should include:
Precise information about data ownership of passed values and returned values. Users can assume that any mutable value returned by a component will not be modified by the user or the component after it is returned. Similarly, any mutable value passed to a component will not be later modified, whether by the component or the caller. Any deviation from these defaults should be documented.
Note
It can be surprisingly hard to avoid mutating data -- for example, append(..)
surprisingly mutates its first argument. It is also hard to detect these bugs, as they are often intermittent, cause silent data corruption, or introduce rare data races. Where performance is not an issue, prefer to copy mutable input and outputs to avoid any potential bugs.
Precise information about goroutines and blocking. Users can assume that methods do not block indefinitely, so blocking methods should be documented as such. Methods that invoke callbacks should be clear about how the callback is invoked, and what it might do. For example, document whether the callback can block, and whether it might be called concurrently with other code.
You might need to express the fact that some of your dependencies are optional. This often happens for components that interact with many other components if available (that is, if they were included at compile time). This allows your component to interact with each other without forcing their inclusion in the current binary.
The optional.Option type answers this need.
For examples, consider the metadata components that are included in multiple binaries (core-agent
, DogStatsD
, etc.). These components use the sysprobeconfig
component if it is available. sysprobeconfig
is available in the core-agent
but not in DogStatsD
.
To do this in the metadata
component:
type Requires struct {\n SysprobeConf optional.Option[sysprobeconfig.Component]\n [...]\n}\n\nfunc NewMetadata(deps Requires) (metadata.Component) {\n if sysprobeConf, found := deps.SysprobeConf.Get(); found {\n // interact with sysprobeconfig\n }\n}\n
The above code produces a generic component, included in both core-agent
and DogStatsD
binaries, that can interact with sysprobeconfig
without forcing the binaries to compile with it.
You can use this pattern for every component, since all components provide Fx with a conversion function to convert their Component
interfaces to optional.Option[Component]
(see creating components).
The Agent uses Fx as its application framework. While the linked Fx documentation is thorough, it can be a bit difficult to get started with. This document describes how Fx is used within the Agent in a more approachable style.
"},{"location":"components/fx/#what-is-it","title":"What Is It?","text":"Fx's core functionality is to create instances of required types \"automatically,\" also known as dependency injection. Within the agent, these instances are components, so Fx connects components to one another. Fx creates a single instance of each component, on demand.
This means that each component declares a few things about itself to Fx, including the other components it depends on. An \"app\" then declares the components it contains to Fx, and instructs Fx to start up the whole assembly.
"},{"location":"components/fx/#providing-and-requiring","title":"Providing and Requiring","text":"Fx connects components using types. Within the Agent, these are typically interfaces named Component
. For example, scrubber.Component
might be an interface defining functionality for scrubbing passwords from data structures:
type Component interface {\n ScrubString(string) string\n}\n
Fx needs to know how to provide an instance of this type when needed, and there are a few ways:
fx.Provide(NewScrubber)
where NewScrubber
is a constructor that returns a scrubber.Component
. This indicates that if and when a scrubber.Component
is required, Fx should call NewScrubber
. It will call NewScrubber
only once, using the same value everywhere it is required.fx.Supply(scrubber)
where scrubber
implements the scrubber.Component
interface. When another component requires a scrubber.Component
, this is the instance it will get.The first form is much more common, as most components have constructors that do interesting things at runtime. A constructor can return multiple arguments, in which case the constructor is called if any of those argument types are required. Constructors can also return error
as the final return type. Fx will treat an error as fatal to app startup.
Fx also needs to know when an instance is required, and this is where the magic happens. In specific circumstances, it uses reflection to examine the argument list of functions, and creates instances of each argument's type. Those circumstances are:
fx.Provide
. Imagine NewScrubber
depends on the config module to configure secret matchers: func NewScrubber(config config.Component) Component {\n return &scrubber{\n matchers: makeMatchersFromConfig(config),\n }\n}\n
fx.Invoke
: fx.Invoke(func(sc scrubber.Component) {\n fmt.Printf(\"scrubbed: %s\", sc.ScrubString(somevalue))\n})\n
Like constructors, Invoked functions can take multiple arguments, and can optionally return an error. Invoked functions are called automatically when an app is created.Pointers passed to fx.Populate
.
var sc scrubber.Component\n// ...\nfx.Populate(&sc)\n
Populate is useful in tests to fill an existing variable with a provided value. It's equivalent to fx.Invoke(func(tmp scrubber.Component) { *sc = tmp })
. Functions can take multple arguments of different types, requiring all of them.
You may have noticed that all of the fx
methods defined so far return an fx.Option
. They don't actually do anything on their own. Instead, Fx uses the functional options pattern from Rob Pike. The idea is that a function takes a variable number of options, each of which has a different effect on the result.
In Fx's case, the function taking the options is fx.New
, which creates a new fx.App
. It's within the context of an app that requirements are met, constructors are called, and so on.
Tying the example above together, a very simple app might look like this:
someValue = \"my password is hunter2\"\napp := fx.New(\n fx.Provide(scrubber.NewScrubber),\n fx.Invoke(func(sc scrubber.Component) {\n fmt.Printf(\"scrubbed: %s\", sc.ScrubString(somevalue))\n }))\napp.Run()\n// Output: scrubbed: my password is *******\n
For anything more complex, it's not practical to call fx.Provide
for every component in a single source file. Fx has two abstraction mechanisms that allow combining lots of options into one app:
fx.Options
simply bundles several Option values into a single Option that can be placed in a variable. As the example in the Fx documentation shows, this is useful to gather the options related to a single Go package, which might include un-exported items, into a single value typically named Module
.fx.Module
is very similar, with two additional features. First, it requires a module name which is used in some Fx logging and can help with debugging. Second, it creates a scope for the effects of fx.Decorate
and fx.Replace
. The second feature is not used in the Agent.So a slightly more complex version of the example might be:
scrubber/component.go main.gofunc Module() fxutil.Module {\n return fx.Module(\"scrubber\",\n fx.Provide(newScrubber)) // now newScrubber need not be exported\n}\n
someValue = \"my password is hunter2\"\napp := fx.New(\n scrubber.Module(),\n fx.Invoke(func(sc scrubber.Component) {\n fmt.Printf(\"scrubbed: %s\", sc.ScrubString(somevalue))\n }))\napp.Run()\n// Output: scrubbed: my password is *******\n
"},{"location":"components/fx/#lifecycle","title":"Lifecycle","text":"Fx provides an fx.Lifecycle
component that allows hooking into application start-up and shut-down. Use it in your component's constructor like this:
func newScrubber(lc fx.Lifecycle) Component {\n sc := &scrubber{..}\n lc.Append(fx.Hook{OnStart: sc.start, OnStop: sc.stop})\n return sc\n}\n\nfunc (sc *scrubber) start(ctx context.Context) error { .. }\nfunc (sc *scrubber) stop(ctx context.Context) error { .. }\n
This separates the application's lifecycle into a few distinct phases:
Fx provides some convenience types to help build constructors that require or provide lots of types: fx.In
and fx.Out
. Both types are embedded in structs, which can then be used as argument and return types for constructors, respectively. By convention, these are named dependencies
and provides
in Agent code:
type dependencies struct {\n fx.In\n\n Config config.Component\n Log log.Component\n Status status.Component\n)\n\ntype provides struct {\n fx.Out\n\n Component\n // ... (we'll see why this is useful below)\n}\n\nfunc newScrubber(deps dependencies) (provides, error) { // can return an fx.Out struct and other types, such as error\n // ..\n return provides {\n Component: scrubber,\n // ..\n }, nil\n}\n
In and Out provide a nice way to summarize and document requirements and provided types, and also allow annotations via Go struct tags. Note that annotations are also possible with fx.Annotate
, but it is much less readable and its use is discouraged.
Value groups make it easier to produce and consume many values of the same type. A component can add any type into groups which can be consumed by other components.
For example:
Here, two components add a server.Endpoint
type to the server
group (note the group
label in the fx.Out
struct).
type provides struct {\n fx.Out\n Component\n Endpoint server.Endpoint `group:\"server\"`\n}\n
type provides struct {\n fx.Out\n Component\n Endpoint server.Endpoint `group:\"server\"`\n}\n
Here, a component requests all the types added to the server
group. This takes the form of a slice received at instantiation (note once again the group
label but in fx.In
struct).
type dependencies struct {\n fx.In\n Endpoints []Endpoint `group:\"server\"`\n}\n
"},{"location":"components/fx/#day-to-day-usage","title":"Day-to-Day Usage","text":"Day-to-day, the Agent's use of Fx is fairly formulaic. Following the component guidelines, or just copying from other components, should be enough to make things work without a deep understanding of Fx's functionality.
"},{"location":"components/migration/","title":"Integrating with other components","text":"After you create your component, you can link it to other components such as flares. (Others, like status pages or health, will come later).
This section documents how to fully integrate your component in the Agent ecosystem.
"},{"location":"components/overview/","title":"Overview of components","text":"The Agent is structured as a collection of components working together. Depending on how the binary is built, and how it is invoked, different components may be instantiated.
"},{"location":"components/overview/#what-is-a-component","title":"What is a component?","text":"The goal of a component is to encapsulate a particular piece of logic/feature and provide a clear and documented interface.
A component must:
Any change within a component that don't change its interface should not require QA of another component using it.
Since each component is an interface to the outside, it can have several implementations.
"},{"location":"components/overview/#fx-vs-go-module","title":"Fx vs Go module","text":"Components are designed to be used with a dependency injection framework. In the Agent, we use Fx, a dependency injection framework, for this. All Agent binaries use Fx to load, coordinate, and start the required components.
Some components are used outside the datadog-agent
repository, where Fx is not available. To support this, the components implementation must not require Fx. Component implementations can be exported as Go modules. The next section explains in more detail how to create components.
The important information here is that it's possible to use components without Fx outside the Agent repository. This comes at the cost of manually doing the work of Fx.
"},{"location":"components/overview/#important-note-on-fx","title":"Important note on Fx","text":"The component framework project's core goal is to improve the Agent codebase by decoupling parts of the code, removing global state and init functions, and increasing reusability by separating logical units into components. Fx itself is not intrinsic to the benefits of componentization.
"},{"location":"components/overview/#next","title":"Next","text":"Next, see how to create a bundle and a component by using Fx.
"},{"location":"components/testing/","title":"Testing components","text":"Testing is an essential part of the software development life cycle. This page covers everything you need to know about testing components.
One of the core benefits of using components is that each component isolates its internal logic behind its interface. Focus on asserting that each implementation behaves correctly.
To recap from the previous page, a component was created that compresses the payload before sending it to the Datadog backend. The component has two separate implementations.
This is the component's interface:
comp/compression/def/component.gotype Component interface {\n // Compress compresses the input data.\n Compress([]byte) ([]byte, error)\n\n // Decompress decompresses the input data.\n Decompress([]byte) ([]byte, error)\n}\n
Ensure the Compress
and Decompress
functions behave correctly.
Writing tests for a component implementation follows the same rules as any other test in a Go project. See the testing package documentation for more information.
For this example, write a test file for the zstd
implementation. Create a new file named component_test.go
in the impl-zstd folder
. Inside the test file, initialize the component's dependencies, create a new component instance, and test the behavior.
All components expect a Requires
struct with all the necessary dependencies. To ensure a component instance can be created, create a requires
instance.
The Requires
struct declares a dependency on the config component and the log component. The following code snippet shows how to create the Require
struct:
package implzstd\n\nimport (\n \"testing\"\n\n configmock \"github.com/DataDog/datadog-agent/comp/core/config/mock\"\n logmock \"github.com/DataDog/datadog-agent/comp/core/log/mock\"\n)\n\nfunc TestCompress(t *testing.T) {\n logComponent := configmock.New(t)\n configComponent := logmock.New(t)\n\n requires := Requires{\n Conf: configComponent,\n Log: logComponent,\n }\n // [...]\n}\n
To create the log and config component, use their respective mocks. The mock package was mentioned previously in the Creating a Component page.
"},{"location":"components/testing/#testing-the-components-interface","title":"Testing the component's interface","text":"Now that the Require
struct is created, an instance of the component can be created and its functionality tested:
package implzstd\n\nimport (\n \"testing\"\n\n configmock \"github.com/DataDog/datadog-agent/comp/core/config/mock\"\n logmock \"github.com/DataDog/datadog-agent/comp/core/log/mock\"\n)\n\nfunc TestCompress(t *testing.T) {\n logComponent := configmock.New(t)\n configComponent := logmock.New(t)\n\n requires := Requires{\n Conf: configComponent,\n Log: logComponent,\n }\n\n provides := NewComponent(requires)\n component := provides.Comp\n\n result, err := component.Compress([]byte(\"Hello World\"))\n assert.Nil(t, err)\n\n assert.Equal(t, ..., result)\n}\n
"},{"location":"components/testing/#testing-lifecycle-hooks","title":"Testing lifecycle hooks","text":"Sometimes a component uses Fx lifecycle to add hooks. It is a good practice to test the hooks as well.
For this example, imagine a component wants to add some hooks into the app lifecycle. Some code is omitted for simplicity:
comp/somecomponent/impl/component.gopackage impl\n\nimport (\n \"context\"\n\n somecomponent \"github.com/DataDog/datadog-agent/comp/somecomponent/def\"\n compdef \"github.com/DataDog/datadog-agent/comp/def\"\n)\n\ntype Requires struct {\n Lc compdef.Lifecycle\n}\n\ntype Provides struct {\n Comp somecomponent.Component\n}\n\ntype component struct {\n started bool\n stopped bool\n}\n\nfunc (c *component) start() error {\n // [...]\n\n c.started = true\n\n return nil\n}\n\nfunc (h *healthprobe) stop() error {\n // [...]\n\n c.stopped = true\n c.started = false\n\n return nil\n}\n\n// NewComponent creates a new healthprobe component\nfunc NewComponent(reqs Requires) (Provides, error) {\n provides := Provides{}\n comp := &component{}\n\n reqs.Lc.Append(compdef.Hook{\n OnStart: func(ctx context.Context) error {\n return comp.start()\n },\n OnStop: func(ctx context.Context) error {\n return comp.stop()\n },\n })\n\n provides.Comp = comp\n return provides, nil\n}\n
The goal is to test that the component updates the started
and stopped
fields.
To accomplish this, create a new lifecycle instance, create a Require
struct instance, initialize the component, and validate that calling Start
on the lifecycle instance calls the component hook and executes the logic.
To create a lifecycle instance, use the helper function compdef.NewTestLifecycle(t *testing.T)
. The function returns a lifecycle wrapper that can be used to populate the Requires
struct. The Start
and Stop
functions can also be called.
Info
You can see the NewTestLifecycle
function here
package impl\n\nimport (\n \"context\"\n \"testing\"\n\n compdef \"github.com/DataDog/datadog-agent/comp/def\"\n \"github.com/stretchr/testify/assert\"\n)\n\nfunc TestStartHook(t *testing.T) {\n lc := compdef.NewTestLifecycle(t)\n\n requires := Requires{\n Lc: lc,\n }\n\n provides, err := NewComponent(requires)\n\n assert.NoError(t, err)\n\n assert.NotNil(t, provides.Comp)\n internalComponent := provides.Comp.(*component)\n\n ctx := context.Background()\n lc.AssertHooksNumber(1)\n assert.NoError(t, lc.Start(ctx))\n\n assert.True(t, internalComponent.started)\n}\n
For this example, a type cast operation had to be performed because the started
field is private. Depending on the component, this may not be necessary.
Using components within other components is covered on the create components page.
Now let's explore how to use components in your binaries. One of the core idea behind component design is to be able to create new binaries for customers by aggregating components.
"},{"location":"components/using-components/#the-cmd-folder","title":"thecmd
folder","text":"All main
functions and binary entry points should be in the cmd
folder.
The cmd
folder uses the following hierarchy:
cmd /\n <binary name> /\n main.go <-- The entry points from your binary\n subcommands / <-- All subcommand for your binary CLI\n <subcommand name> / <-- The code specific to a single subcommand\n command.go\n command_test.go\n
Say you want to add a test
command to the agent
CLI.
You would create the following file:
cmd/agent/subcommands/test/command.gopackage test\n\nimport (\n// [...]\n)\n\n// Commands returns a slice of subcommands for the 'agent' command.\n//\n// The Agent uses \"cobra\" to create its CLI. The command method is your entrypoint. Here, you're going to create a single\n// command.\nfunc Commands(globalParams *command.GlobalParams) []*cobra.Command {\n cmd := &cobra.Command{\n Use: \"test\",\n Short: \"a test command for the Agent\",\n Long: ``,\n RunE: func(_ *cobra.Command, _ []string) error {\n return fxutil.OneShot(\n <callback>,\n <list of dependencies>.\n )\n },\n }\n\n return []*cobra.Command{cmd}\n}\n
The code above creates a test command that does nothing. As you can see, fxutil.OneShot
helpers are being used. These helpers initialize an Fx app with all the wanted dependencies.
The next section explains how to request a dependency.
"},{"location":"components/using-components/#importing-components","title":"Importing components","text":"The fxutil.OneShot
takes a list of components and gives them to Fx. Note that this only tells Fx how to create types when they're needed. This does not do anything else.
For a component to be instantiated, it must be one of the following:
callback
functionfx.Invoke
. More on this on the Fx page.Let's require the log
components:
import (\n // First let's import the FX wrapper to require it\n logfx \"github.com/DataDog/datadog-agent/comp/core/log/fx\"\n // Then the logger interface to use it\n log \"github.com/DataDog/datadog-agent/comp/core/log/def\"\n)\n\n// [...]\n return fxutil.OneShot(\n myTestCallback, // The function to call from fxutil.OneShot\n logfx.Module(), // This will tell FX how to create the `log.Component`\n )\n// [...]\n\nfunc myTestCallback(logger log.Component) {\n logger.Info(\"some message\")\n}\n
"},{"location":"components/using-components/#importing-bundles","title":"Importing bundles","text":"Now let's say you want to include the core bundle instead. The core bundle offers many basic features (logger, config, telemetry, flare, ...).
import (\n // We import the core bundle\n core \"github.com/DataDog/datadog-agent/comp/core\"\n\n // Then the interfaces we want to use\n config \"github.com/DataDog/datadog-agent/comp/core/config/def\"\n)\n\n// [...]\n return fxutil.OneShot(\n myTestCallback, // The function to call from fxutil.OneShot\n core.Bundle(), // This will tell FX how to create the all the components included in the bundle\n )\n// [...]\n\nfunc myTestCallback(conf config.Component) {\n api_key := conf.GetString(\"api_key\")\n\n // [...]\n}\n
It's very important to understand that since myTestCallback
only uses the config.Component
, not all components from the core
bundle are instantiated! The core.Bundle
instructs Fx how to create components, but only the ones required are created.
In our example, the config.Component
might have dozens of dependencies instantiated from the core bundle. Fx handles all of this.
As your migration to components is not finished, you might need to manually instruct Fx on how to use plain types.
You will need to use fx.Supply
for this. More details can be found here.
But here is a quick example:
import (\n logfx \"github.com/DataDog/datadog-agent/comp/core/log/fx\"\n log \"github.com/DataDog/datadog-agent/comp/core/log/def\"\n)\n\n// plain custom type\ntype custom struct {}\n\n// [...]\n return fxutil.OneShot(\n myTestCallback,\n logfx.Module(),\n\n // fx.Supply populates values into Fx.\n // Any time this is needed, Fx will use it.\n fx.Supply(custom{}),\n )\n// [...]\n\n// Here our function uses component and non-component type, both provided by Fx.\nfunc myTestCallback(logger log.Component, c custom) {\n logger.Infof(\"Custom type: %v\", c)\n}\n
Info
This means that components can depend on plain types too (as long as the main entry point populates Fx options with them).
"},{"location":"components/shared_features/flares/","title":"Flare","text":"The general idea is to register a callback within your component to be called each time a flare is created. This uses Fx groups under the hood, but helpers are there to abstract all the complexity.
Once the callback is created, you will have to migrate the code related to your component from pkg/flare
to your component.
To add data to a flare, you first need to register a callback, also known as a FlareBuilder
.
Within your component, create a method with the following signature: func (c *yourComp) fillFlare(fb flaretypes.FlareBuilder) error
.
This function is called every time the Agent generates a flare\u2014whether from the CLI, RemoteConfig, or from the running Agent. Your callback takes a FlareBuilder as parameter. This object provides all the helpers functions needed to add data to a flare (adding files, copying directories, scrubbing data, and so on).
Example:
import (\n yaml \"gopkg.in/yaml.v2\"\n\n flare \"github.com/DataDog/datadog-agent/comp/core/flare/def\"\n)\n\nfunc (c *myComponent) fillFlare(fb flare.FlareBuilder) error {\n // Creating a new file\n fb.AddFile( \n \"runtime_config_dump.yaml\",\n []byte(\"content of my file\"),\n ) //nolint:errcheck \n\n // Copying a file from the disk into the flare\n fb.CopyFile(\"/etc/datadog-agent/datadog.yaml\") //nolint:errcheck\n return nil\n}\n
Read the FlareBuilder package documentation for more information on the API.
Any errors returned by the FlareBuilder
methods are logged into a file shipped within the flare. This means, in most cases, you can ignore errors returned by the FlareBuilder
methods. In all cases, ship as much data as possible in a flare instead of stopping at the first error.
Returning an error from your callback does not stop the flare from being created or sent. Rather, the error is logged into the flare too.
While it's possible to register multiple callbacks from the same component, try to keep all the flare code in a single callback.
"},{"location":"components/shared_features/flares/#register-your-callback","title":"Register your callback","text":"Now you need to register your callback to be called each time a flare is created. To do this, your component constructor needs to provide a new Provider. Use NewProvider function for this.
Example:
import (\n flare \"github.com/DataDog/datadog-agent/comp/core/flare/def\"\n)\n\ntype Provides struct {\n // [...]\n\n // Declare that your component will return a flare provider\n FlareProvider flare.Provider\n}\n\nfunc newComponent(deps Requires) Provides {\n // [...]\n\n return Provides{\n // [...]\n\n // NewProvider will wrap your callback in order to be use as a 'Provider'\n FlareProvider: flare.NewProvider(myComponent.fillFlare),\n }, nil\n}\n
"},{"location":"components/shared_features/flares/#testing","title":"Testing","text":"The flare component offers a FlareBuilder mock to test your callback.
Example:
import (\n \"testing\"\n \"github.com/DataDog/datadog-agent/comp/core/flare/helpers\"\n)\n\nfunc TestFillFlare(t testing.T) {\n myComp := newComponent(...)\n\n flareBuilderMock := helpers.NewFlareBuilderMock(t)\n\n myComp.fillFlare(flareBuilderMock, false)\n\n flareBuilderMock.AssertFileExists(\"datadog.yaml\")\n flareBuilderMock.AssertFileContent(\"some_file.txt\", \"my content\")\n // ...\n}\n
"},{"location":"components/shared_features/flares/#migrating-your-code","title":"Migrating your code","text":"Now comes the hard part: migrating the code from pkg/flare
related to your component to your new callback.
The good news is that the code in pkg/flare
already uses the FlareBuilder
interface. So you shouldn't need to rewrite any logic. Don't forget to migrate the tests too and expand them (most of the flare features are not tested).
Keep in mind that the goal is to delete pkg/flare
once the migration to component is done.
Components can register a status provider. When the status command is executed, we will populate the information displayed using all the status providers.
"},{"location":"components/shared_features/status/#status-providers","title":"Status Providers","text":"There are two types of status providers: - Header Providers: these providers are displayed at the top of the status output. This section is reserved for the most important information about the agent, such as agent version, hostname, host info, or metadata. - Regular Providers: these providers are rendered after all the header providers.
Each provider has the freedom to configure how they want to display their information for the three types of status output: JSON, Text, and HTML. This flexibility allows you to tailor the output to best suit your component's needs.
The JSON and Text outputs are displayed within the status CLI, while the HTML output is used for the Agent GUI.
To guarantee consistent output, we order the status providers internally. The ordering mechanism is different depending on the status provider. We order the header providers based on an index using the ascending direction. The regular providers are ordered alphabetically based on their names.
"},{"location":"components/shared_features/status/#header-providers-interface","title":"Header Providers Interface","text":"type HeaderProvider interface {\n // Index is used to choose the order in which the header information is displayed.\n Index() int\n // When displaying the Text output the name is render as a header\n Name() string\n JSON(verbose bool, stats map[string]interface{}) error\n Text(verbose bool, buffer io.Writer) error\n HTML(verbose bool, buffer io.Writer) error\n}\n
"},{"location":"components/shared_features/status/#regular-providers-interface","title":"Regular Providers Interface","text":"// Provider interface\ntype Provider interface {\n // Name is used to sort the status providers alphabetically.\n Name() string\n // Section is used to group the status providers.\n // When displaying the Text output the section is render as a header\n Section() string\n JSON(verbose bool, stats map[string]interface{}) error\n Text(verbose bool, buffer io.Writer) error\n HTML(verbose bool, buffer io.Writer) error\n}\n
"},{"location":"components/shared_features/status/#adding-a-status-provider","title":"Adding a status provider","text":"To add a status provider to your component, you need to declare it in the return value of its NewComponent()
function.
The status component provides helper functions to create status providers: NewInformationProvider
and NewHeaderInformationProvider
.
Also, the status component has helper functions to render text and HTML output: RenderText
and RenderHTML.
The signature for both functions is:
(templateFS embed.FS, template string, buffer io.Writer, data any)\n
The embed.FS
variable points to the location of the different status templates. These templates must be inside the component files. The folder must be named status_templates
. The name of the templates do not have any rules, but to keep the same consistency across the code, we suggest using \"<component>.tmpl\"
for the text template and \"<component>HTML.tmpl\"
for the HTML template.
Below is an example of adding a status provider to your component.
comp/compression/impl/compressor.gopackage impl\n\nimport (\n \"fmt\"\n\n compression \"github.com/DataDog/datadog-agent/comp/compression/def\"\n \"github.com/DataDog/datadog-agent/comp/status\"\n)\n\ntype Requires struct {\n}\n\ntype Provides struct {\n Comp compression.Component\n Status status.InformationProvider\n}\n\ntype compressor struct {\n}\n\n// NewComponent returns an implementation for the compression component\nfunc NewComponent(reqs Requires) Provides {\n comp := &compressor{}\n\n return Provides{\n Comp: comp,\n Status: status.NewInformationProvider(&comp)\n }\n}\n\n//\n// Since we are using the compressor as status provider we need to implement the status interface on our component\n//\n\n//go:embed status_templates\nvar templatesFS embed.FS\n\n// Name renders the name\nfunc (c *compressor) Name() string {\n return \"Compression\"\n}\n\n// Index renders the index\nfunc (c *compressor) Section() int {\n return \"Compression\"\n}\n\n// JSON populates the status map\nfunc (c *compressor) JSON(_ bool, stats map[string]interface{}) error {\n c.populateStatus(stats)\n\n return nil\n}\n\n// Text renders the text output\nfunc (c *compressor) Text(_ bool, buffer io.Writer) error {\n return status.RenderText(templatesFS, \"compressor.tmpl\", buffer, c.getStatusInfo())\n}\n\n// HTML renders the html output\nfunc (c *compressor) HTML(_ bool, buffer io.Writer) error {\n return status.RenderHTML(templatesFS, \"compressorHTML.tmpl\", buffer, c.getStatusInfo())\n}\n\nfunc (c *compressor) populateStatus(stats map[string]interface{}) {\n // Here we populate whatever informatiohn we want to display for our component\n stats[\"compressor\"] = ...\n}\n\nfunc (c *compressor) getStatusInfo() map[string]interface{} {\n stats := make(map[string]interface{})\n\n c.populateStatus(stats)\n\n return stats\n}\n
"},{"location":"components/shared_features/status/#testing","title":"Testing","text":"A critical part of your component development is ensuring that the status output is displayed as expected is. We highly encourage you to add tests to your components, giving you the confidence that your status output is accurate and reliable. For our example above, testing the status output is as easy as testing the result of calling JSON
, Text
and HTML
.
package impl\n\nimport (\n \"bytes\"\n \"testing\"\n)\n\nfunc TestText(t *testing.T) {\n requires := Requires{}\n\n provides := NewComponent(requires)\n component := provides.Comp\n buffer := new(bytes.Buffer)\n\n result, err := component.Text(false, buffer)\n assert.Nil(t, err)\n\n assert.Equal(t, ..., string(result))\n}\n\nfunc TestJSON(t *testing.T) {\n requires := Requires{}\n\n provides := NewComponent(requires)\n component := provides.Comp\n info := map[string]interface{}\n\n result, err := component.JSON(false, info)\n assert.Nil(t, err)\n\n assert.Equal(t, ..., result[\"compressor\"])\n}\n
To complete testing, we encourage adding the new status section output as part of the e2e tests. The CLI status e2e tests are in test/new-e2e/tests/agent-subcommands/status
folder.
First of all, thanks for contributing!
This document provides some basic guidelines for contributing to this repository. To propose improvements, feel free to submit a PR.
"},{"location":"guidelines/contributing/#submitting-issues","title":"Submitting issues","text":"Have you fixed a bug or written a new check and want to share it? Many thanks!
In order to ease/speed up our review, here are some items you can check/improve when submitting your PR:
Contributor ChecklistReviewer ChecklistHave a proper commit history (we advise you to rebase if needed) with clear commit messages.
Write tests for the code you wrote.
Preferably make sure that all tests pass locally.
Summarize your PR with an explanatory title and a message describing your changes, cross-referencing any related bugs/PRs.
Use Reno to create a release note.
Open your PR against the main
branch.
Provide adequate QA/testing plan information.
The added code comes with tests.
The CI is green, all tests are passing (required or not).
All applicable labels are set on the PR (see PR labels list).
If applicable, the config template has been updated.
Note
Adding GitHub labels is only possible for contributors with write access.
Your pull request must pass all CI tests before we will merge it. If you're seeing an error and don't think it's your fault, it may not be! Join us on Slack or send us an email, and together we'll get it sorted out.
"},{"location":"guidelines/contributing/#keep-it-small-focused","title":"Keep it small, focused","text":"Avoid changing too many things at once. For instance if you're fixing the NTP check and at the same time shipping a dogstatsd improvement, it makes reviewing harder and the time-to-release longer.
"},{"location":"guidelines/contributing/#commit-messages","title":"Commit Messages","text":"Please don't be this person: git commit -m \"Fixed stuff\"
. Take a moment to write meaningful commit messages.
The commit message should describe the reason for the change and give extra details that will allow someone later on to understand in 5 seconds the thing you've been working on for a day.
This includes editing the commit message generated by GitHub from:
Including new features\n\n* Fix linter\n* WIP\n* Add test for x86\n* Fix licenses\n* Cleanup headers\n
to:
Including new features\n\nThis feature does this and that. Some tests are excluded on x86 because of ...\n
If your commit is only shipping documentation changes or example files, and is a complete no-op for the test suite, please add [skip ci] in the commit message body to skip the build and give that slot to someone else who does need it.
"},{"location":"guidelines/contributing/#pull-request-workflow","title":"Pull request workflow","text":"The goals ordered by priority are:
main
branch, have a meaningful commit history that allows understanding (even years later) what each commit does, and why.You must open the PR when the code is reviewable or you must set the PR as draft if you want to share code before it's ready for actual reviews.
"},{"location":"guidelines/contributing/#before-the-first-pr-review","title":"Before the first PR review","text":"Before the first PR review, meaningful commits are best: logically-encapsulated commits help the reviews go quicker and make the job for the reviewer easier. Conflicts with main
can be resolved with a git rebase origin/main
and a force push if it makes future review(s) easier.
After the first review, to make follow-up reviews easier:
main
using git merge origin/main
main
","text":"Once reviews are complete, the merge to main
should be done with either:
main
clean (even though some context/details are lost in the squash). The commit message for this squash should always be edited to concisely describe the commit without extraneous \u201caddress review comments\u201d text.We use Reno
to create our CHANGELOG. Reno is a pretty simple tool.
Each PR should include a releasenotes
file created with reno
, unless the PR doesn't have any impact on the behavior of the Agent and therefore shouldn't be mentioned in the CHANGELOG (examples: repository documentation updates, changes in code comments). PRs that don't require a release note file will be labeled changelog/no-changelog
by maintainers.
To install reno: pip install reno
Ultra quick Reno
HOWTO:
$> reno new <topic-of-my-pr> --edit\n[...]\n# Remove unused sections and fill the relevant ones.\n# Reno will create a new file in releasenotes/notes.\n#\n# Each section from every release note are combined when the CHANGELOG.rst is\n# rendered. So the text needs to be worded so that it does not depend on any\n# information only available in another section. This may mean repeating some\n# details, but each section must be readable independently of the other.\n#\n# Each section note must be formatted as reStructuredText.\n[...]\n
Then just add and commit the new releasenote (located in releasenotes/notes/
) with your PR. If the change is on the trace-agent
(folders cmd/trace-agent
or pkg/trace
) please prefix the release note with \"APM :\" and the argument with \"apm-\"."},{"location":"guidelines/contributing/#reno-sections","title":"Reno sections","text":"
The main thing to keep in mind is that the CHANGELOG is written for the agent's users and not its developers.
features
: describe shortly what your feature does.
example:
features:\n - |\n Introducing the Datadog Process Agent for Windows.\n
enhancements
: describe enhancements here: new behavior that are too small to be considered a new feature.
example:
enhancements:\n - |\n Windows: Add PDH data to flare.\n
issues
: describe known issues or limitation of the agent.
example:
issues:\n - |\n Kubernetes 1.3 & OpenShift 3.3 are currently not fully supported: docker\n and kubelet integrations work OK, but apiserver communication (event\n collection, `kube_service` tagging) is not implemented\n
upgrade
: List actions to take or limitations that could arise upon upgrading the Agent. Notes here must include steps that users can follow to 1. know if they're affected and 2. handle the change gracefully on their end.
example:
upgrade:\n - |\n If you run a Nomad agent older than 0.6.0, the `nomad_group`\n tag will be absent until you upgrade your orchestrator.\n
deprecations
: List deprecation notes here.
example:
deprecations:\n- |\n Changed the attribute name to enable log collection from YAML configuration\n file from \"log_enabled\" to \"logs_enabled\", \"log_enabled\" is still\n supported.\n
security
: List security fixes, issues, warning or related topics here.
example:
security:\n - |\n The /agent/check-config endpoint has been patched to enforce\n authentication of the caller via a bearer session token.\n
fixes
: List the fixes done in your PR here. Remember to be clear and give a minimum of context so people reading the CHANGELOG understand what the fix is about.
example:
fixes:\n - |\n Fix EC2 tags collection when multiple marketplaces are set.\n
other
: Add here every other information you want in the CHANGELOG that don't feat in any other section. This section should rarely be used.
example:
other:\n - |\n Only enable the ``resources`` metadata collector on Linux by default, to match\n Agent 5's behavior.\n
For internal PRs (from people in the Datadog organization), you have few extra labels that can be use:
community/help-wanted
: for community PRs where help is needed to finish it.community
: for community PRs.changelog/no-changelog
: for PRs that don't require a reno releasenote (useful for PRs only changing documentation or tests).qa/done
or qa/no-code-change
: used to skip the QA week:
qa/done
label is recommended in case of code changes and manual / automated QA done before merge.qa/no-code-change
is recommended if there's no code changes in the Agent binary code.Important
Use qa/no-code-change
if your PR only changes tests or a module/package that does not end up in the Agent build. All of the following do not require QA:
major_change
: to flag the PR as a major change impacting many/all teams working on the agent and will require deeper QA (example: when we change the Python version shipped in the agent).
need-change/operator
, need-change/helm
: indicate that the configuration needs to be modified in the operator / helm chart as well.k8s/<min-version>
: indicate the lowest Kubernetes version compatible with the PR's feature.backport/<branch-name>
: Add this label to automatically create a PR against the <branch-name>
branch with your backported changes. The backport PR creation is triggered:
If there is a conflict, the bot prompts you with a list of instructions to follow (example) to manually backport your PR.
Also called checks, all officially supported Agent integrations live in the integrations-core repo. Please look there to submit related issues, PRs, or review the latest changes. For new integrations, please open a pull request in the integrations-extras repo.
"},{"location":"guidelines/docs/","title":"Writing developer docs","text":"This site is built by MkDocs and uses the Material for MkDocs theme.
You can serve documentation locally with the docs.serve
invoke task.
The site structure is defined by the nav
key in the mkdocs.yml
file.
When adding new pages, first think about what it is exactly that you are trying to document. For example, if you intend to write about something everyone must follow as a standard practice it would be classified as a guideline whereas a short piece about performing a particular task would be a how-to.
After deciding the kind of content, strive to further segment the page under logical groupings for easier navigation.
"},{"location":"guidelines/docs/#line-continuations","title":"Line continuations","text":"For prose where the rendered content should have no line breaks, always keep the Markdown on the same line. This removes the need for any stylistic enforcement and allows for IDEs to intelligently wrap as usual.
Tip
When you wish to force a line continuation but stay within the block, indent by 2 spaces from the start of the text and end the block with a new line. For example, the following shows how you would achieve a multi-line ordered list item:
Markdown1. first line\n\n second line\n\n1. third line\n
Rendered first line
second line
third line
When you want to call something out, use admonitions rather than making large chunks of text bold or italicized. The latter is okay for small spans within sentences.
Here's an example:
Markdown
!!! info\n Lorem ipsum ...\n
Rendered
Info
Lorem ipsum ...
Always use inline links rather than reference links.
The only exception to that rule is links that many pages may need to reference. Such links may be added to this file, which all pages are able to reference.
"},{"location":"guidelines/docs/#abbreviations","title":"Abbreviations","text":"Abbreviations like DSD may be added to this file which will make it so that a tooltip will be displayed on hover.
"},{"location":"guidelines/deprecated-components-documentation/defining-apps/","title":"Defining Apps and Binaries","text":""},{"location":"guidelines/deprecated-components-documentation/defining-apps/#binaries","title":"Binaries","text":"Each binary is defined as a main
package in the cmd/
directory, such as cmd/iot-agent
. This top-level package contains only a simple main
function (or often, one for Windows and one for *nix) which performs any platform-specific initialization and then creates and executes a Cobra command.
Consider carefully the tree of Go imports that begins with the main
package. While the Go linker does some removal of unused symbols, the safest means to ensure a particular package isn't occupying space in the resulting binary is to not include it.
A \"simple binary\" here is one that does not have subcommands.
The Cobra configuration for the binary is contained in the command
subpackage of the main package (cmd/<binary>/command
). The main
function calls this package to create the command, and then executes it:
func main() {\n if err := command.MakeCommand().Execute(); err != nil {\n os.Exit(-1)\n }\n}\n
The command.MakeCommand
function creates the *cobra.Command
for the binary, with a RunE
field that defines an app, as described below.
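For illustration, here is a minimal sketch of such a command package for a simple binary; the binary name and the run function are hypothetical, not taken from the Agent codebase:
package command\n\n// MakeCommand returns the root *cobra.Command for this (hypothetical) simple binary.\nfunc MakeCommand() *cobra.Command {\n return &cobra.Command{\n Use: \"my-binary\",\n Short: \"Run my-binary\",\n RunE: func(cmd *cobra.Command, args []string) error {\n // the RunE field defines the app, typically via fxutil.OneShot or fxutil.Run\n return run()\n },\n }\n}\n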
Many binaries have a collection of subcommands, along with some command-line flags defined at the binary level. For example, the agent
binary has subcommands like agent flare
or agent diagnose
and accepts global --cfgfile
and --no-color
arguments.
As with simple binaries, the top-level Cobra command is defined by a MakeCommand
function in cmd/<binary>/command
. This command
package should also define a GlobalParams
struct and a SubcommandFactory
type:
// GlobalParams contains the values of agent-global Cobra flags.\n//\n// A pointer to this type is passed to SubcommandFactory's, but its contents\n// are not valid until Cobra calls the subcommand's Run or RunE function.\ntype GlobalParams struct {\n // ConfFilePath holds the path to the folder containing the configuration\n // file, to allow overrides from the command line\n ConfFilePath string\n\n // ...\n}\n\n// SubcommandFactory is a callable that will return a slice of subcommands.\ntype SubcommandFactory func(globalParams *GlobalParams) []*cobra.Command\n
Each subcommand is implemented in a subpackage of cmd/<binary>/subcommands
, such as cmd/<binary>/subcommands/version
. Each such subpackage contains a command.go
defining a Commands
function that defines the subcommands for that package:
func Commands(globalParams *command.GlobalParams) []*cobra.Command {\n cmd := &cobra.Command { .. }\n return []*cobra.Command{cmd}\n}\n
While Commands
typically returns only one command, it may make sense to return multiple commands when the implementations share substantial amounts of code, such as starting, stopping and restarting a service.
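As a hedged sketch (the controlService helper and subcommand names are hypothetical, not actual Agent code), a Commands function returning several closely related subcommands might look like this:
func Commands(globalParams *command.GlobalParams) []*cobra.Command {\n // start and stop share most of their implementation, so both are returned together\n startCmd := &cobra.Command{\n Use: \"start\",\n RunE: func(cmd *cobra.Command, args []string) error { return controlService(globalParams, \"start\") },\n }\n stopCmd := &cobra.Command{\n Use: \"stop\",\n RunE: func(cmd *cobra.Command, args []string) error { return controlService(globalParams, \"stop\") },\n }\n return []*cobra.Command{startCmd, stopCmd}\n}\n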
The main
function supplies a slice of subcommand factories to command.MakeCommand
, which calls each one and adds the resulting subcommands to the root command.
subcommandFactories := []command.SubcommandFactory{\n frobnicate.Commands,\n ...,\n}\nif err := command.MakeCommand(subcommandFactories).Execute(); err != nil {\n os.Exit(-1)\n}\n
The GlobalParams
type supports Cobra arguments that are global to all subcommands. It is passed to each subcommand factory so that the defined RunE
callbacks can access these arguments. If the binary has no global command-line arguments, it's OK to omit this type.
func MakeCommand(subcommandFactories []SubcommandFactory) *cobra.Command {\n globalParams := GlobalParams{}\n\n cmd := &cobra.Command{ ... }\n cmd.PersistentFlags().StringVarP(\n &globalParams.ConfFilePath, \"cfgpath\", \"c\", \"\",\n \"path to directory containing datadog.yaml\")\n\n for _, sf := range subcommandFactories {\n subcommands := sf(&globalParams)\n for _, subcommand := range subcommands {\n // add each subcommand to the root command; the loop variable must not shadow cmd\n cmd.AddCommand(subcommand)\n }\n }\n\n return cmd\n}\n
If the available subcommands depend on build flags, move the creation of the subcommand factories to the subcommands/<command>
package and create the slice there using source files with //go:build
directives. Your factory can return nil
if your command is not compatible with the current build flag. In all cases, the subcommands build logic should be constrained to its package. See cmd/agent/subcommands/jmx/command_nojmx.go
for an example.
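A hedged sketch of that pattern (the file names, build tag, and makeJMXCommand helper are illustrative, not the actual Agent sources): one file contributes the real subcommand, the other contributes a factory that returns nil when the feature is compiled out:
// command_jmx.go\n//go:build jmx\n\npackage jmx\n\nfunc Commands(globalParams *command.GlobalParams) []*cobra.Command {\n return []*cobra.Command{makeJMXCommand(globalParams)}\n}\n\n// command_nojmx.go\n//go:build !jmx\n\npackage jmx\n\nfunc Commands(globalParams *command.GlobalParams) []*cobra.Command {\n // the feature is not built in, so no subcommands are contributed\n return nil\n}\n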
Apps map directly to fx.App
instances, and as such they define a set of provided components and instantiate some of them.
The fx.App
is always created after Cobra has parsed the command-line, within a cobra.Command#RunE
function. This means that the components supplied to an app, and any BundleParams values, are specific to the invoked command or subcommand.
A one-shot app is one which performs some task and exits, such as agent status
. The pkg/util/fxutil.OneShot
helper function provides a convenient shorthand to run a function only after all components have started. Use it like this:
cmd := cobra.Command{\n Use: \"foo\", ...,\n RunE: func(cmd *cobra.Command, args []string) error {\n return fxutil.OneShot(run,\n fx.Supply(core.BundleParams{}),\n core.Bundle(),\n ..., // any other bundles needed for this app\n )\n },\n}\n\nfunc run(log log.Component) error {\n log.Debug(\"foo invoked!\")\n ...\n}\n
The run
function typically also needs some command-line values. To support this, create a (sub)command-specific cliParams
type containing the required values, and embedding a pointer to GlobalParams:
type cliParams struct {\n *command.GlobalParams\n useTLS bool\n args []string\n}\n
Populate this type within Commands
, supply it as an Fx value, and require that value in the run
function:
func Commands(globalParams *command.GlobalParams) []*cobra.Command {\n cliParams := &cliParams{\n GlobalParams: globalParams,\n }\n cmd := cobra.Command{\n Use: \"foo\", ...,\n RunE: func(cmd *cobra.Command, args []string) error {\n cliParams.args = args\n return fxutil.OneShot(run,\n fx.Supply(cliParams),\n fx.Supply(core.BundleParams{}),\n core.Bundle(),\n ..., // any other bundles needed for this app\n )\n },\n }\n cmd.PersistentFlags().BoolVarP(&cliParams.useTLS, \"usetls\", \"\", false, \"force TLS use\")\n\n return []*cobra.Command{cmd}\n}\n\nfunc run(cliParams *cliParams, log log.Component) error {\n if cliParams.Verbose {\n log.Info(\"executing foo\")\n }\n ...\n}\n
This example includes cli params drawn from GlobalParams (Verbose
), from subcommand-specific args (useTLS
), and from Cobra (args
).
A daemon app is one that runs \"forever\", such as agent run
. Use the fxutil.Run
helper function for this variety of app:
cmd := cobra.Command{\n Use: \"foo\", ...,\n RunE: func(cmd *cobra.Command, args []string) error {\n return fxutil.Run(\n fx.Supply(core.BundleParams{}),\n core.Bundle(),\n ..., // any other bundles needed for this app\n fx.Supply(foo.BundleParams{}),\n foo.Bundle(), // the bundle implementing this app\n )\n },\n}\n
"},{"location":"guidelines/deprecated-components-documentation/defining-bundles/","title":"Defining Component Bundles","text":"A bundle is defined in a dedicated package named comp/<bundleName>
. The package must have the following defined in bundle.go
:
// team: <teamname>
. This is used to generate CODEOWNERS information.
BundleParams
-- the type of the bundle's parameters (see below). This item should have a formulaic doc string like // BundleParams defines the parameters for this bundle.
Bundle
-- an fx.Option
that can be included in an fx.App
to make this bundle's components available. To assist with debugging, use fxutil.Bundle(options...)
. Use fx.Invoke(func(componentpkg.Component) {})
to instantiate components automatically. This item should have a formulaic doc string like // Module defines the fx options for this component.
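Putting those requirements together, a minimal bundle.go might look roughly like the following sketch; the team name and the somecomponent package are placeholders, not actual Agent code:
// Package somebundle ... (bundle-level documentation)\n//\n// team: some-team\npackage somebundle\n\n// BundleParams defines the parameters for this bundle.\ntype BundleParams struct{}\n\n// Bundle defines the fx options for this bundle.\nvar Bundle = fxutil.Bundle(\n somecomponent.Module(),\n // instantiate the bundle's top-level component automatically\n fx.Invoke(func(somecomponent.Component) {}),\n)\n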
Typically, a bundle will automatically instantiate the top-level components that represent the bundle's purpose. For example, the trace-agent bundle comp/trace
might automatically instantiate comp/trace/agent
.
You can use the invoke task deva components.new-bundle comp/<bundleName>
to generate a pre-filled bundle.go
file for the given bundle.
Apps can provide some initialization-time parameters to bundles. These parameters are limited to two kinds:
Anything else is runtime configuration and should be handled via comp/core/config
or another mechanism.
Bundle parameters must only store Params
types for subcomponents. The reason is that each subcomponent must be usable without BundleParams
.
import \".../comp/<bundleName>/foo\"\nimport \".../comp/<bundleName>/bar\"\n// ...\n\n// BundleParams defines the parameters for this bundle.\ntype BundleParams struct {\n Foo foo.Params\n Bar bar.Params\n}\n\nvar Bundle = fxutil.Bundle(\n // You must tell to fx how to get foo.Params from BundleParams.\n fx.Provide(func(params BundleParams) foo.Params { return params.Foo }),\n foo.Module(),\n // You must tell to fx how to get bar.Params from BundleParams.\n fx.Provide(func(params BundleParams) bar.Params { return params.Bar }),\n bar.Module(),\n)\n
"},{"location":"guidelines/deprecated-components-documentation/defining-bundles/#testing","title":"Testing","text":"A bundle should have a test file, bundle_test.go
, to verify the documentation's claim about its dependencies. This simply uses fxutil.TestBundle
to check that all dependencies are satisfied when given the full set of required bundles.
func TestBundleDependencies(t *testing.T) {\n fxutil.TestBundle(t, Bundle)\n}\n
"},{"location":"guidelines/deprecated-components-documentation/purpose/","title":"Purpose of component guidelines","text":"This section describes the mechanics of implementing apps, components, and bundles.
The guidelines are quite prescriptive, with the intent of making all components \"look the same\". This reduces cognitive load when using components -- no need to remember one component's peculiarities. It also allows Agent-wide changes, where we make the same formulaic change to each component. If a situation arises that contradicts the guidelines, then we can update the guidelines (and change all affected components).
"},{"location":"guidelines/deprecated-components-documentation/registrations/","title":"Component Registrations","text":"Components generally need to talk to one another! In simple cases, that occurs by method calls. But in many cases, a single component needs to communicate with a number of other components that all share some characteristics. For example, the comp/core/health
component monitors the health of many other components, and comp/workloadmeta/scheduler
provides workload events to an arbitrary number of subscribers.
The convention in the Agent codebase is to use value groups to accomplish this. The collecting component requires a slice of some collected type, and the providing components provide values of that type. Consider an example case of an HTTP server component to which endpoints can be attached. The server is the collecting component, requiring a slice of type []*endpoint
, where *endpoint
is the collected type. Providing components provide values of type *endpoint
.
The convention is to \"wrap\" the collected type in a Registration
struct type which embeds fx.Out
and has tag group:\"pkgname\"
, where pkgname
is the short package name (Fx requires a group name, and this is as good as any). This helps providing components avoid the common mistake of omitting the tag. Because it is wrapped in an exported Registration
type, the collected type can be an unexported type, as in the example below.
The collecting component should define the registration type and a constructor for it:
comp/server/component.go// ...\n// Server endpoints are provided by other components, by providing a server.Registration\n// instance.\n// ...\npackage server\n\ntype endpoint struct { // (the collected type)\n ...\n}\n\ntype Registration struct {\n fx.Out\n\n Endpoint endpoint `group:\"server\"`\n}\n\n// NewRegistration creates a new Registration instance for the given endpoint.\nfunc NewRegistration(route string, handler func()) Registration { ... }\n
Its implementation then requires a slice of the collected type (endpoint
), again using group:\"server\"
:
// endpoint defines an endpoint on this server.\ntype endpoint struct { ... }\n\ntype dependencies struct {\n fx.In\n\n Registrations []endpoint `group:\"server\"`\n}\n\nfunc newServer(deps dependencies) Component {\n // ...\n for _, e := range deps.Registrations {\n if e.handler == nil {\n continue\n }\n // ...\n }\n // ...\n}\n
It's good practice to ignore zero values, as that allows providing components to skip the registration if desired.
Finally, the providing component (in this case, foo
) includes a registration in its output as an additional provided type, beyond its Component
type:
func newFoo(deps dependencies) (Component, server.Registration) {\n // ...\n return foo, server.NewRegistration(\"/things/foo\", foo.handler)\n}\n
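When the providing component is disabled, it can take advantage of the zero-value convention described above. A hypothetical sketch, assuming an enabled flag read through the config component:
func newFoo(deps dependencies) (Component, server.Registration) {\n f := &foo{}\n if !deps.Config.GetBool(\"foo.enabled\") { // assumed config accessor and setting\n // return the zero value; the server skips registrations with a nil handler\n return f, server.Registration{}\n }\n return f, server.NewRegistration(\"/things/foo\", f.handler)\n}\n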
This technique has some caveats to be aware of:
Component
type is required. This may lead to components being instantiated in unexpected circumstances.
Subscriptions are a common form of registration, and have support in the pkg/util/subscriptions
package.
In defining subscriptions, the component that transmits messages is the collecting component, and the processes receiving components are the providing components. These are matched using the message type, which must be unique across the codebase, and should not be a built-in type like string
. Providing components provide a subscriptions.Receiver[coll.Message]
which has a Ch
channel from which to receive messages. Collecting components require a subscriptions.Transmitter[coll.Message]
which has a Notify
method to send messages.
// ...\n// To subscribe to these announcements, provide a subscriptions.Receiver[announcer.Announcement].\n// ...\npackage announcer\n
func newAnnouncer(tx subscriptions.Transmitter[Announcement]) Component {\n return &announcer{announcementTx: tx} // (store the transmitter)\n}\n\n// ... later send messages with\nfunc (ann *announcer) announce(a Announcement) {\n ann.announcementTx.Notify(a)\n}\n
func newListener() (Component, subscriptions.Receiver[announcer.Announcement]) {\n rx := subscriptions.NewReceiver[announcer.Announcement]() // create a receiver\n return &listener{announcementRx: rx}, rx // capture the receiver _and_ return it\n}\n\n// ... later receive messages (usually in an actor's main loop)\nfunc (l *listener) run() {\n for {\n select {\n case a := <-l.announcementRx.Ch:\n ...\n }\n }\n}\n
Any component receiving messages via a subscription will automatically be instantiated by Fx if it is declared in the app, regardless of whether its Component type is required by some other component. The workaround for this is to return a zero-valued Receiver when the component does not actually wish to receive messages (such as when the component is disabled by user configuration).
If a receiving component does not subscribe (for example, if it is not started), it can return the zero value, subscriptions.Receiver[Event]{}
, from its constructor. If a component returns a non-nil subscriber, it must consume messages from the receiver or risk blocking the transmitter.
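A hedged sketch of that workaround, reusing the listener example from above and assuming an enabled flag read through the config component:
func newListener(deps dependencies) (Component, subscriptions.Receiver[announcer.Announcement]) {\n if !deps.Config.GetBool(\"listener.enabled\") { // assumed config accessor and setting\n // zero value: this listener is not subscribed, so transmitters ignore it\n return &listener{}, subscriptions.Receiver[announcer.Announcement]{}\n }\n rx := subscriptions.NewReceiver[announcer.Announcement]() // create a receiver\n return &listener{announcementRx: rx}, rx\n}\n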
See the pkg/util/subscriptions
documentation for more details.
Component dependencies are automatically determined from the arguments to a component constructor. Most components have a few dependencies, and use a struct named dependencies
to represent them:
type dependencies struct {\n fx.In\n\n Lc fx.Lifecycle\n Params internal.BundleParams\n Config config.Component\n Log log.Component\n // ...\n}\n\nfunc newThing(deps dependencies) Component {\n t := &thing{\n log: deps.Log,\n ...\n }\n deps.Lc.Append(fx.Hook{OnStart: t.start})\n return t\n}\n
"},{"location":"guidelines/deprecated-components-documentation/using-components/#testing","title":"Testing","text":"Testing for a component should use fxtest
to create the component. This focuses testing on the API surface of the component against which other components will be built. Per-function unit tests are, of course, also great where appropriate!
Here's an example testing a component with a mocked dependency on other
:
func TestMyComponent(t *testing.T) {\n var comp Component\n var other other.Component\n app := fxtest.New(t,\n Module, // use the real version of this component\n other.MockModule(), // use the mock version of other\n fx.Populate(&comp), // get the instance of this component\n fx.Populate(&other), // get the (mock) instance of the other component\n )\n\n // start and, at completion of the test, stop the components\n defer app.RequireStart().RequireStop()\n\n // cast `other` to its mock interface to call mock-specific methods on it\n other.(other.Mock).SetSomeValue(10) // Arrange\n comp.DoTheThing() // Act\n require.Equal(t, 20, other.(other.Mock).GetSomeResult()) // Assert\n}\n
If the component has a mock implementation, it is a good idea to test that mock implementation as well.
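As a brief sketch of that idea (the Mock interface and its SetSomeValue/GetSomeResult methods mirror the hypothetical API used above), the mock can be exercised through fxtest just like the real implementation:
func TestMyComponentMock(t *testing.T) {\n var mock Mock\n app := fxtest.New(t,\n MockModule(), // use the mock version of this component\n fx.Populate(&mock), // get the mock instance\n )\n defer app.RequireStart().RequireStop()\n\n mock.SetSomeValue(10) // Arrange\n require.Equal(t, 10, mock.GetSomeResult()) // Assert: assumes the mock echoes the configured value\n}\n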
"},{"location":"hostname/hostname_force_config_as_canonical/","title":"Config-provided hostname starting withip-
or domu
","text":""},{"location":"hostname/hostname_force_config_as_canonical/#description-of-the-issue","title":"Description of the issue","text":"In v6 and v7 Agents, if hostname
is set in datadog.yaml
(or through the DD_HOSTNAME
env var) and its value starts with ip-
or domu
, the hostname is not used in-app as the canonical hostname, even if it is a valid hostname. More information about what a canonical hostname is can be found at How does Datadog determine the Agent hostname?.
To know if your Agents are affected, starting with v6.16.0 and v7.16.0, the Agent logs the following warning if it detects a situation where the config-provided hostname is a valid hostname but will not be accepted as the canonical hostname in-app: Hostname '<HOSTNAME>' defined in configuration are not used as the in-app hostname. For more information: https://dtdg.co/agent-hostname-force-config-as-canonical
If this warning is logged, you have the following options:
hostname
from datadog.yaml
(or the DD_HOSTNAME
env var) and restart the Agent; orip-
or domu
","text":"Starting with Agent v6.16.0 and v7.16.0, the Agent supports the config option hostname_force_config_as_canonical
(default: false
). When set to true
, a configuration-provided hostname starting with ip-
or domu
is accepted as the canonical hostname in-app:
The repository contains a few submodules. To add a new one and ensure it is tested, follow these steps:
Create a directory for the module:
cd ~/my_path_to/datadog-agent && mkdir mymodule\n
Initialize a new Go module:
cd path/to/mymodule && go mod init\n
Create a dummy root package file doc.go
:
cat >doc.go <<EOL\n// Unless explicitly stated otherwise all files in this repository are licensed\n// under the Apache License Version 2.0.\n// This product includes software developed at Datadog (https://www.datadoghq.com/).\n// Copyright 2016-present Datadog, Inc.\npackage mymodule\nEOL\n
Update the modules.yml
file at the root of the repository with this content:
path/to/mymodule:\n independent: true\n should_tag: false\n test_targets:\n - .\n
independent
: Should it be importable as an independent module?should_tag
: Should the Agent pipeline tag it?test_targets
: Should go test
target specific subfolders?If you use your module in another module within datadog-agent
, add the require
and replace
directives in go.mod
.
From the other module root, install the dependency with go get
:
go get github.com/DataDog/datadog-agent/path/to/mymodule\n
Then add the replace directive in the go.mod
file: module github.com/DataDog/datadog-agent/myothermodule\ngo 1.18\n// Replace with local version\nreplace github.com/DataDog/datadog-agent/path/to/mymodule => ../path/to/mymodule\nrequire (\n github.com/DataDog/datadog-agent/path/to/mymodule v0.0.0-20230526143644-ed785d3a20d5\n)\n
Example PR: #17350 virtualenv $GOPATH/src/github.com/DataDog/datadog-agent/venv
If using virtual environments when running the built Agent, you may need to override the built Agent's search path for Python check packages using the PYTHONPATH
variable (your target path must have the pre-requisite core integration packages installed though).
PYTHONPATH="./venv/lib/python3.11/site-packages:$PYTHONPATH" ./agent run ...
See also some notes in ./checks about running custom python checks.
Our invoke tasks are only compatible with Python 3, thus you will need to use Python 3 to run them.
Though you may install invoke in a variety of ways, we suggest you use the provided requirements file and pip
:
pip install -r tasks/requirements.txt
-
This procedure ensures you not only get the correct version of invoke
, but also any additional python dependencies our development workflow may require, at their expected versions. It will also pull other handy development tools/deps (reno
, or docker
).
You must install Golang version 1.22.8
or higher. Make sure that $GOPATH/bin
is in your $PATH
otherwise invoke
cannot use any additional tool it might need.
Note
Versions of Golang that aren't an exact match to the version specified in our build images (see e.g. here) may not be able to build the agent and/or the rtloader binary properly.
From the root of datadog-agent
, run invoke install-tools
to install go tooling. This uses go
to install the necessary dependencies.
When working on the Agent codebase you can choose among two different ways to build the binary, informally named System and Embedded builds. For most contribution scenarios you should rely on the System build (the default) and use the Embedded one only for specific use cases. Let's explore the differences.
System builds use your operating system's standard system libraries to satisfy the Agent's external dependencies. Since, for example, macOS 10.11 may provide a different version of Python than macOS 10.12, system builds on each of these platforms may produce different Agent binaries. If this doesn't matter to you—perhaps you just want to contribute a quick bugfix—do a System build; it's easier and faster than an Embedded build. System build is the default for all build and test tasks, so you don't need to configure anything there. But to make sure you have system copies of all the Agent's dependencies, skip the Embedded build section below and read on to see how to install them via your usual package manager (apt, yum, brew, etc).
Embedded builds download specifically-versioned dependencies and compile them locally from sources. We run Embedded builds to create Datadog's official Agent releases (i.e. RPMs, debs, etc), and while you can run the same builds while developing locally, the process is as slow as it sounds. Hence, you should only use them when you care about reproducible builds. For example:
Embedded builds rely on Omnibus to download and build dependencies, so you need a recent ruby
environment with bundler
installed. See how to build Agent packages with Omnibus for more details.
The agent is able to collect systemd journal logs using a wrapper on the systemd utility library.
On Ubuntu/Debian:
sudo apt-get install libsystemd-dev
+
This procedure ensures you not only get the correct version of invoke
, but also any additional python dependencies our development workflow may require, at their expected versions. It will also pull other handy development tools/deps (reno
, or docker
).
You must install Golang version 1.23.3
or later. Make sure that $GOPATH/bin
is in your $PATH
otherwise invoke
cannot use any additional tool it might need.
Note
Versions of Golang that aren't an exact match to the version specified in our build images (see e.g. here) may not be able to build the agent and/or the rtloader binary properly.
From the root of datadog-agent
, run invoke install-tools
to install go tooling. This uses go
to install the necessary dependencies.
When working on the Agent codebase you can choose among two different ways to build the binary, informally named System and Embedded builds. For most contribution scenarios you should rely on the System build (the default) and use the Embedded one only for specific use cases. Let's explore the differences.
System builds use your operating system's standard system libraries to satisfy the Agent's external dependencies. Since, for example, macOS 10.11 may provide a different version of Python than macOS 10.12, system builds on each of these platforms may produce different Agent binaries. If this doesn't matter to you—perhaps you just want to contribute a quick bugfix—do a System build; it's easier and faster than an Embedded build. System build is the default for all build and test tasks, so you don't need to configure anything there. But to make sure you have system copies of all the Agent's dependencies, skip the Embedded build section below and read on to see how to install them via your usual package manager (apt, yum, brew, etc).
Embedded builds download specifically-versioned dependencies and compile them locally from sources. We run Embedded builds to create Datadog's official Agent releases (i.e. RPMs, debs, etc), and while you can run the same builds while developing locally, the process is as slow as it sounds. Hence, you should only use them when you care about reproducible builds. For example:
Embedded builds rely on Omnibus to download and build dependencies, so you need a recent ruby
environment with bundler
installed. See how to build Agent packages with Omnibus for more details.
The agent is able to collect systemd journal logs using a wrapper on the systemd utility library.
On Ubuntu/Debian:
sudo apt-get install libsystemd-dev
On Redhat/CentOS:
sudo yum install systemd-devel
If you want to build a Docker image containing the Agent, or if you wan to run system and integration tests you need to run a recent version of Docker in your dev environment.
We use Doxygen to generate the documentation for the rtloader
part of the Agent.
To generate it (using the invoke rtloader.generate-doc
command), you'll need to have Doxygen installed on your system and available in your $PATH
. You can compile and install Doxygen from source with the instructions available here. Alternatively, you can use already-compiled Doxygen binaries from here.
To get the dependency graphs, you may also need to install the dot
executable from graphviz and add it to your $PATH
.
It is optional but recommended to install pre-commit
to run a number of checks done by the CI locally.
To install it, run:
python3 -m pip install pre-commit
pre-commit install
The shellcheck
pre-commit hook requires having the shellcheck
binary installed and in your $PATH
. To install it, run:
deva install-shellcheck --destination <path>
(by default, the shellcheck binary is installed in /usr/local/bin
).
pre-commit
¶If you want to skip pre-commit
for a specific commit you can add --no-verify
to the git commit
command.
pre-commit
manually¶If you want to run one of the checks manually, you can run pre-commit run <check name>
.
You can run it on all files with the --all-files
flag.
pre-commit run flake8 --all-files # run flake8 on all files
-
See pre-commit run --help
for further options.
Microsoft Visual Studio Code with the devcontainer plugin allows you to use a container as a remote development environment in VS Code. It simplifies and isolates the dependencies needed to develop in this repository.
To configure the vscode editor to use a container as remote development environment you need to:
deva vscode.setup-devcontainer --image "<image name>"
. This command will create the devcontainer configuration file ./devcontainer/devcontainer.json
.Microsoft Visual Studio Code is recommended as it's lightweight and versatile.
Building on Windows requires multiple pieces of 3rd-party software to be installed. To avoid this complexity, Datadog recommends making the code change in VS Code and then doing the build in the Docker image. For complete information, see Build the Agent packages