[host receiver]: cpu reporting metrics #1534

MovieStoreGuy · 2024-10-29T07:05:38Z

Component(s)

receiver/hostmetrics

Is your feature request related to a problem? Please describe.

Currently the host cpu scraper reports the following optional metrics:

system.cpu.logical.count - combined total of threads available
system.cpu.physical.count - total number of cores available

On a system with a dual-socket (or larger) motherboard, this can be confusing, since there is currently no way to count the number of CPU sockets available.

Describe the solution you'd like

The ideal scenarios are:

Improve the documentation around what each metric actually captures with relation to threads and cores
Add a metric to show the number of installed CPUs
Update the metric names to system.cpu.cores.count, system.cpu.threads.count, and system.cpu.count so that it is clear what they represent.
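
To make the distinction concrete, here is a minimal sketch of what the three proposed metrics would report. The `HostTopology` type and the example hosts are hypothetical and are not part of the receiver; only the metric names come from the proposal above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HostTopology:
    """Hypothetical description of a host, used only for illustration."""
    sockets: int
    cores_per_socket: int
    threads_per_core: int


def proposed_metrics(topo: HostTopology) -> dict:
    """Return the values the proposed metric names would carry for a host."""
    cores = topo.sockets * topo.cores_per_socket
    return {
        "system.cpu.count": topo.sockets,
        "system.cpu.cores.count": cores,
        "system.cpu.threads.count": cores * topo.threads_per_core,
    }


# Two hypothetical hosts with identical core and thread totals.
dual_socket = proposed_metrics(HostTopology(sockets=2, cores_per_socket=8, threads_per_core=2))
single_socket = proposed_metrics(HostTopology(sockets=1, cores_per_socket=16, threads_per_core=2))
```

Both hosts report the same core and thread totals, so only a socket-level metric tells them apart.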

Describe alternatives you've considered

I don't think it is common to run machines that have multiple CPU sockets, which I suspect is why this hasn't come up sooner. The other option would be to include a resource tag of the CPU socket name.
(Trying not to confuse myself with cpu, core, and thread)

Additional context

No response


github-actions · 2024-10-29T07:05:56Z

Pinging code owners:

receiver/hostmetrics: @dmitryax @braydonk

See Adding Labels via Comments if you do not have permissions to add labels yourself.

ChrsMark · 2024-10-30T12:19:28Z

Since those are already part of Semantic Conventions I guess it should be discussed at the Sem Conv level first?

/cc @open-telemetry/semconv-system-approvers

braydonk · 2024-10-30T14:57:44Z

I generally agree with system.cpu.cores.count, system.cpu.threads.count, and system.cpu.count. I find those names simpler to understand and more directly related to how the system reports this info to you (see lscpu on Linux, for example).

Based on these SemConv naming rules, I think the metric names should be pluralized, since cores, threads, and physical CPUs are countable quantities and are not UpDownCounters, i.e. system.cpu.cores, system.cpu.threads. system.cpus is a very weird metric name though, so I'm not sure. I will ask what the semconv maintainers think; I don't have an opinion other than that cpus just looks weird.

In addition, perhaps a system.cpu.sockets metric would be a good idea too. This is a similarly easy piece of info to get from the system.
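
As a rough illustration of how the socket count can be read alongside cores and threads, here is a minimal sketch that parses /proc/cpuinfo-style text. It assumes the `processor`, `physical id`, and `core id` fields as they appear on x86 Linux; other architectures may not expose them, and this is not necessarily how the receiver gathers these values. The function name and the sample text are made up for the example.

```python
# Hypothetical, trimmed-down sample in the style of /proc/cpuinfo on x86 Linux.
SAMPLE_CPUINFO = """\
processor : 0
physical id : 0
core id : 0

processor : 1
physical id : 0
core id : 0

processor : 2
physical id : 1
core id : 0

processor : 3
physical id : 1
core id : 0
"""


def count_topology(cpuinfo_text):
    """Count sockets, physical cores, and threads from cpuinfo-style text."""
    threads = 0
    sockets = set()
    cores = set()
    physical_id = None
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank separator line between processor entries
        key, value = key.strip(), value.strip()
        if key == "processor":
            threads += 1        # one entry per logical CPU (thread)
            physical_id = None  # reset for the new entry
        elif key == "physical id":
            physical_id = value
            sockets.add(value)
        elif key == "core id" and physical_id is not None:
            # Core ids repeat across sockets, so key on the (socket, core) pair.
            cores.add((physical_id, value))
    return {"sockets": len(sockets), "cores": len(cores), "threads": threads}


topology = count_topology(SAMPLE_CPUINFO)
```

Keying cores on the (socket, core) pair matters because `core id` values are only unique within a socket.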

braydonk · 2024-10-30T18:33:53Z

Addendum to what I said above.

The CPU count can change on the same host resource (as last came up when we were talking about whether these metrics should be a resource attribute or a timeseries), so these might actually be instrumented as UpDownCounters after all. The options for either direction are summarized below.

Additionally, to get around my cpus weirdness, we could keep the logical vs physical verbiage.

If this is considered a countable entity that falls under pluralization we should do:

system.cpu.physical.cores
system.cpu.logical.cores
system.cpu.threads
system.cpu.sockets

Or if these are UpDownCounters/gauges:

system.cpu.physical.core.count
system.cpu.logical.core.count
system.cpu.thread.count
system.cpu.socket.count

system.cpu.*.core.count is quite wordy though, so we could also go with what the issue originally suggested, system.cpu.count and system.cpu.core.count. My instinct is that I like the physical and logical verbiage anyway (though we definitely should improve our documentation on what exactly that means), but the names are quite wordy with .count at the end.

I think the end state for these metrics is for them to become attributes on the eventual host entity. However, it would be really nice to get this defined well enough that we could adopt it in the collector before entities make it there. Right now, the original issue is correct that there is a usability problem, and I'd like to get a fix for it without waiting for Entities to standardize. If anyone has pointers on how these metrics can be designed so that it wouldn't be hard to transition them to entity attributes in the future, I'd appreciate them.

trask · 2024-10-30T18:58:20Z

Here's some prior art, in case that helps: jvm.cpu.count

braydonk · 2024-10-30T19:28:53Z

Thanks @trask, that's good to know. That precedent is probably enough to say that this should be an UpDownCounter timeseries. I was waffling on whether this should be an UpDownCounter or a Gauge, but if a similar metric is already an UpDownCounter then that may be enough. I also figured there was a potential use case where an operator may want to aggregate a physical CPU count metric across a pool of VMs, for example to aggregate cost or to track CPU cores against some platform quota. It's a niche case, but since I was able to think of some use case, that seemed like it might be enough to call it an UpDownCounter as well.
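
The aggregation use case can be sketched as a simple sum, which is meaningful precisely because per-host counts are additive. The host names, counts, and quota below are made up for illustration, and the helper functions are hypothetical rather than part of any collector or backend API.

```python
def total_physical_cores(per_host_counts):
    """Sum a per-host physical core count across a pool of hosts."""
    return sum(per_host_counts.values())


def quota_headroom(per_host_counts, quota):
    """Return how many cores remain under a (hypothetical) platform quota."""
    return quota - total_physical_cores(per_host_counts)


# Hypothetical latest values of a physical core count metric, keyed by host.
pool = {"vm-a": 8, "vm-b": 16, "vm-c": 4}

total = total_physical_cores(pool)
headroom = quota_headroom(pool, quota=32)
```

A Gauge would carry the same per-host values, but summing across hosts is the kind of aggregation that additive instruments such as UpDownCounters are meant to support.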

MovieStoreGuy added the enhancement New feature or request label Oct 29, 2024

mx-psi transferred this issue from open-telemetry/opentelemetry-collector-contrib Oct 30, 2024

mx-psi added the system-semconv-wg-ga-blocker label Oct 30, 2024
