Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Telemetry2 thread info #9084

Closed
wants to merge 6 commits into from

Conversation

jsarha
Copy link
Contributor

@jsarha jsarha commented Apr 26, 2024

No description provided.

Copy link
Member

@lgirdwood lgirdwood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good stuff, will help with debug, I expect we will also be able to call these APIs via shell too.

src/debug/telemetry/telemetry2_slot.c Outdated Show resolved Hide resolved
src/debug/telemetry/telemetry2_slot.c Outdated Show resolved Hide resolved
src/debug/telemetry/telemetry2_slot.c Outdated Show resolved Hide resolved
src/debug/telemetry/telemetry2_slot.c Outdated Show resolved Hide resolved
src/debug/telemetry/telemetry2_slot.c Show resolved Hide resolved
src/debug/telemetry/telemetry2_thread_info.c Outdated Show resolved Hide resolved
src/debug/telemetry/telemetry2_thread_info.c Outdated Show resolved Hide resolved
src/debug/telemetry/telemetry2_thread_info.c Show resolved Hide resolved
return 0;
}

static uint8_t thread_info_cpu_utilization(struct k_thread *thread,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like some of this file is RTOS specific and should go in Zephyr as Zephyr APIs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Its just refining the Zephyr thread stats cycle-counter values to produce load information. It requires cached values from previous round, and a generic Zephyr size implementation would become quite complex, requiring caller specific context allocation etc. If there is time for it, I can do it of course, but AFAIK we do not have it ATM.

tools/telemetry2/thread-analyzer.py Outdated Show resolved Hide resolved
@jsarha
Copy link
Contributor Author

jsarha commented Apr 29, 2024

One major change idea for this would be making each core data to have its own chunk. In the end it would same couple of bytes and maybe look a bit cleaner. A down side would be the core specific chunks be there in the telemetry2 slot in 'random' order depending on which cores are started up first.

Copy link
Member

@plbossart plbossart left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lots of unclear parts.

If this is a Zephyr-based telemetry, then why not call it telemetry-zephyr instead of a telemetry2 which means someone will suggest a telemetry3 at some point...

@@ -9,3 +9,8 @@ config SOF_TELEMETRY
systick_info measurement which measures scheduler task performance and may
slightly affect overall performance.

config SOF_TELEMETRY2_SLOT
bool "Enable SOF telemetry2 slot in debug window"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what on earth is telemetry2?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think @lgirdwood 's original plan was to move the current telemetry node under a chunk in this new system, e.g. not to waste full 4k just for the current telemetry structure. That of course would need agreement from Windows side and some coordination, so for the time being I just decided to call it telemetry2.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The precisions are even less clear than the original, sorry. you're drowning me in details I didn't even know I didn't need :-)

src/debug/telemetry/telemetry2_slot.c Outdated Show resolved Hide resolved
src/debug/telemetry/telemetry2_slot.c Outdated Show resolved Hide resolved
src/debug/telemetry/telemetry2_slot.c Outdated Show resolved Hide resolved
src/debug/telemetry/telemetry2_slot.c Outdated Show resolved Hide resolved
src/debug/telemetry/telemetry2_thread_info.c Outdated Show resolved Hide resolved
src/debug/telemetry/telemetry2_thread_info.c Outdated Show resolved Hide resolved

struct thread_info_chunk {
struct telemetry2_chunk_hdr hdr;
uint16_t core_count;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a thread can run on multiple cores?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the header of the whole thread info chunk, and the number indicates how many cores we have. The code keeps all the core separate all the time. Nothing else seems to work (DSP hangs). So there is a thread running on each core reporting the threads on that core. This header is just written once in the beginning and is used by the user-space client to find the core specific sections. I'll add some more comments.

tools/telemetry2/thread-info.py Show resolved Hide resolved
tools/telemetry2/thread-info.py Outdated Show resolved Hide resolved
@jsarha
Copy link
Contributor Author

jsarha commented May 2, 2024

Applied all pending comments, but the big question about the new telemetry slots name is still unanswered. What about telemetry_v2, that would indicate that telemetry_v2 is trying to be a super-set of telemetry (v1) and while it can coexist with it, it can also make it obsolete. There is probably also still more comments to be written.

@plbossart
Copy link
Member

telemetry_v2 is trying to be a super-set of telemetry (v1) and while it can coexist with it, it can also make it obsolete.

That really requires explanations !

@lgirdwood
Copy link
Member

telemetry_v2 is trying to be a super-set of telemetry (v1) and while it can coexist with it, it can also make it obsolete.

That really requires explanations !

Intention is to be able to see Zephyr RTOS resources in realtime, not break backwards compatibility with existing telemetry and grow alongside Zephyr. Today only 1 backend, the mem Window, but should also work with probes in future.

V2 naming is really about the ABI, i.e. the feature supports existing "v1" telemetry for compatibility (do not break and can embed) but v2 is to exploit all the data that is available via Zephyr RTOS APIs around scheduling, memory, IRQs, locking, MMU, PM etc.

Expectation is that Zephyr data will grow/change over time hence an IPC based data format is used that will allow ABI growth and wont break tooling.

@jsarha
Copy link
Contributor Author

jsarha commented May 7, 2024

Made some necessary updated due to changes to my PR on Zephyr side (zephyrproject-rtos/zephyr#71408) and while at it, added some text to telemetry2 (I did not change the name just yet) commit message and associated Kconfig help section and header. Also added a commit to increase debug window size from 12k to 16k.

Copy link
Collaborator

@kv2019i kv2019i left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't regularly review drafts, but some drive-by comments given we have some shared problems with dw-slot usage. No blockers in the PR really. If we want to enable this by default, some review/scrutiny on the perf impact might be needed.

src/debug/telemetry/Kconfig Show resolved Hide resolved
src/debug/telemetry/telemetry2_thread_info.c Outdated Show resolved Hide resolved
@@ -79,7 +79,7 @@ CONFIG_INTEL_ADSP_IPC=y
CONFIG_WATCHDOG=y
CONFIG_LL_WATCHDOG=y

CONFIG_MEMORY_WIN_2_SIZE=12288
CONFIG_MEMORY_WIN_2_SIZE=16384
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder how we can make this scale better. Did @marcinszkudlinski you have ideas on this, I think you hit this with tracing backend as well.

I now pushed first PR for shell support zephyrproject-rtos/zephyr#72738 . I was contemplating on adding it to a new slot (and increase WIN_2 size), but:

  • shell won't be enable in all builds (just like tracing and probably this telemetry2 as well)
  • having to touch the CONFIG_MEMORY_WIN_2_SIZE requires touching multiple board specific files, so enabling the feature with a simpler overlay no longer works

No better solution yet. For shell, I now reuse the dw-slot for mtrace and force exclusivity between shell and mtrace.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lets hard code today and we cant print it in teh tooling "I am using LNL so using hard codec value of X", in the future we can have a more dynamic approach.

@jsarha
Copy link
Contributor Author

jsarha commented May 22, 2024

Update the implementation to use the latest API update for this PR: zephyrproject-rtos/zephyr#71408

Also some updates to code comments. Got rid of C++ //-comments.

Jyri Sarha added 6 commits May 24, 2024 16:54
TELEMETRY2 is a generic memory structure for passing real-time debug
information from the SOF system. It only provides the framework for
passing pieces of abstract data from DSP side to host side debugging
tools.

This commit implements telemetry2 data passing from DSP to host side
using debug memory window. The commit defines telemetry2 payload
header and telemetry2 chunk header. Adds function to search, or if not
found reserve, a chunk of specified size and id from telemetry2
slot. Each chunk is forced to star from 64-byte cache line boundary. A
pointer to the chunk is returned.

Signed-off-by: Jyri Sarha <jyri.sarha@linux.intel.com>
Adds code for collecting thread statistic for cpu and stack usage, and
for writing that information to telemetry2 memory chuck.

Signed-off-by: Jyri Sarha <jyri.sarha@linux.intel.com>
Simple code to open telemetry2 debugfs file, check telemetry2 payload
header, look for thread info chunk, decode it periodically, and print
contents to stdout.

Without any arguments this the script opens and polls telemetry2
debugfs file contents. The DSP should be up and running For there to
be any data to be decoded. When DSP up, the telemetry2 thread info
support is compiled into the FW, and the SOF Linux driver supports
telemetry2 slot mapping to debugfs, the output should look like this:

core_count: 3
Core 0 load 1%
        0x401184a8           cpu 0.0% stack 2.0%
        0x401183e0           cpu 0.0% stack 2.0%
        1 thread info        cpu 0.0% stack 8.2%
        0 thread info        cpu 0.0% stack 47.1%
        edf_workq            cpu 0.0% stack 10.2%
        sysworkq             cpu 0.0% stack 31.8%
        logging              cpu 0.8% stack 19.6%
        idle 00              cpu 98.0% stack 33.7%
Core 2 load 6%
        ll_thread2           cpu 0.0% stack 18.8%
        idle 02              cpu 93.3% stack 27.1%
        0x40118570           cpu 0.4% stack 47.1%
        2 thread info        cpu 0.0% stack 47.1%

Signed-off-by: Jyri Sarha <jyri.sarha@linux.intel.com>
Increase Debug Window (Memory window 2) size by one slot (4096 bytes).
The extra space is needed for telemetry2 slot. See SOF_TELEMETRY2_SLOT
option for details.

Signed-off-by: Jyri Sarha <jyri.sarha@linux.intel.com>
@jsarha
Copy link
Contributor Author

jsarha commented May 24, 2024

Relax locking for telemetry2 chunk reservation. Fix some minor style issues.

Copy link
Member

@lgirdwood lgirdwood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Getting better.

uint32_t hdr_size; // size of this header only in bytes
uint32_t total_size; // total size of the whole payload in bytes
uint32_t abi; // ABI version, can be used by tool to warn users if too old.
uint64_t tstamp; // aligned with host epoch.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should same epoch as logging.

#define THREAD_INFO_MAX_THREADS 16

struct thread_info {
char name[14];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there a Zephyr define for this ? - can we reuse it ?

char name[14];
uint8_t stack_usage; /* precentage */
uint8_t cpu_usage; /* precentage */
} __packed;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, but wont that need cache line alignment ?

@@ -65,6 +65,7 @@ static struct previous_counters { // Cached data from previous round
* k_thread_foreach_current_cpu().
*/
struct user_data {
int core;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Data in commit message looks good, but I would put the busy/idle % on the top row (ranking on % busy stack can come later).

import ctypes
import time
import sys

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we comment this file more, i.e. describe what the macros and code blocks do.

@@ -79,7 +79,7 @@ CONFIG_INTEL_ADSP_IPC=y
CONFIG_WATCHDOG=y
CONFIG_LL_WATCHDOG=y

CONFIG_MEMORY_WIN_2_SIZE=12288
CONFIG_MEMORY_WIN_2_SIZE=16384
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lets hard code today and we cant print it in teh tooling "I am using LNL so using hard codec value of X", in the future we can have a more dynamic approach.

@@ -54,6 +54,9 @@

#include <stdint.h>

// HACK: This should be in zephyr/soc/intel/intel_adsp/common/include/adsp_debug_window.h
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So if the payload is application specific we should be in SOF header, but RTOS specific in Zephyr header - if it s both then we cant keep in in application header.

@lgirdwood lgirdwood added this to the v2.11 milestone Jun 18, 2024
@lgirdwood
Copy link
Member

lgirdwood commented Jul 5, 2024

@jsarha any updates ? I think we should have all Zephyr commits upstream now ? Can you rebase too.

@jsarha
Copy link
Contributor Author

jsarha commented Jul 8, 2024

@lgirdwood I am currently doing (or actually re-spinning after two week bug-hunt break) the major rework to change to debug window protocol to core specific circular buffers, you requested. I do not expect it to be ready before end of the week, when I start my two week summer vacation, but I hope it does not take more than another week after that.

@kv2019i
Copy link
Collaborator

kv2019i commented Sep 9, 2024

@jsarha This can be closed now with #9398 merged ?

@kv2019i kv2019i removed this from the v2.11 milestone Sep 10, 2024
@jsarha
Copy link
Contributor Author

jsarha commented Sep 13, 2024

Yes. I'll close it now.

@jsarha jsarha closed this Sep 13, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants