-
Notifications
You must be signed in to change notification settings - Fork 318
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
I/O Performance Measurement: Add I/O Performance Monitor #9330
I/O Performance Measurement: Add I/O Performance Monitor #9330
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some comments from me.
return; | ||
|
||
struct io_perf_monitor_ctx *self = &perf_monitor_ctx; | ||
k_spinlock_key_t key = k_spin_lock(&self->lock); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure about this, this lock was there in reference (not sure what kind of a lock it was), but I think this has potential to disrupt the measured interfaces with a spinlock here. The entries to which we write are just statically defined memory, assigned to a module from performance monitor, and released by the module itself, so I don't think the global lock is needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CI answered my own question, smokes did not pass. Both this and release slot function cannot be locked. io_perf_monitor_release_slot has locks in the bitmap functions, so it should be safe.
48f43da
to
55dab6c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mostly minor opens.
struct io_perf_monitor_ctx { | ||
struct k_spinlock lock; | ||
enum ipc4_perf_measurements_state_set state; | ||
struct perf_bitmap io_performance_data_bitmap;//?????? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably need to explain this i.e. its it unused but needed for backwards compat we should say so.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I do use it. It may be unnecessary though, as this only points to static memory.
int io_perf_monitor_init_data(struct io_perf_data_item **slot_id, | ||
struct io_perf_data_item *init_data) | ||
{ | ||
if (!slot_id) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would put a log() here since its coming from host SW.
struct io_perf_monitor_ctx *io_perf_monitor = &perf_monitor_ctx; | ||
struct io_perf_data_item *new_slot = io_perf_monitor_get_next_slot(io_perf_monitor); | ||
|
||
if (!new_slot) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same here for log(), anything from host could be wrong.
return; | ||
|
||
struct io_perf_monitor_ctx *self = &perf_monitor_ctx; | ||
//k_spinlock_key_t key = k_spin_lock(&self->lock); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
locking needed ? If multicore you could use an atomic add increment ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given "data" is 64bit, atomics won't work, but it would seem locking is probably not needed given a single slot is only used for one interface.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, if only one instance uses a specific slot this should be ok. I'm not 100% sure if this was the original design but it is a thing now. I did add a comment in .h now to make this apparent.
@@ -88,6 +90,11 @@ static void idc_handler(struct k_p4wq_work *work) | |||
idc->received_msg.header = msg->header; | |||
idc->received_msg.extension = msg->extension; | |||
|
|||
#ifdef CONFIG_SOF_TELEMETRY_IO_PERFORMANCE_MEASUREMENTS |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would avoid the #ifdefs in client to maintain readability, i.e. the telemetry header could have the #ifdef and just declare some empty static inline stubs if its not enabled in Kconfig
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Multiple minor comments. The locking is a bit hard to follow, but otherwise looks ok.
return; | ||
|
||
struct io_perf_monitor_ctx *self = &perf_monitor_ctx; | ||
//k_spinlock_key_t key = k_spin_lock(&self->lock); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given "data" is 64bit, atomics won't work, but it would seem locking is probably not needed given a single slot is only used for one interface.
|
||
item->is_removed = true; | ||
|
||
int idx = (item - self->io_perf_data) / sizeof(*item); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't this be one place where locking is needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added lock for this line alone, we'll see if we get any problems in smoke tests. This function did deadlock when i locked all the code here, but it's probably because bitmap has it's own locks too.
55dab6c
to
dea7611
Compare
|
||
ret = perf_bitmap_setbit(&self->io_performance_data_bitmap, idx); | ||
if (ret < 0) | ||
return NULL; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
free the bitmap?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If setbit fails, free also fails (wrong idx)
item->is_removed = true; | ||
|
||
key = k_spin_lock(&self->lock); | ||
idx = (item - self->io_perf_data) / sizeof(*item); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this looks wrong - the subtraction returns the number of structures, not bytes
|
||
ret = perf_bitmap_clearbit(&self->io_performance_data_bitmap, idx); | ||
if (ret < 0) | ||
return ret; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is questionable - whether you should still try to free the bitmap
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This only fails if idx is wrong - "free" will also fail in this case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and idx
can only be wrong if item
is wrong, and you've already used it above to set item->is_removed = true
. So, if item
can really be wrong, can we check it first before modifying anything? And if it cannot be wrong, then we should probably handle "impossible" cases differently here - either just print a big scary error message or even use an assert()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, there's no check for null., and there should be one, added. As item
is within a certain memory range we could check if it is within it also.
} | ||
|
||
static int | ||
io_perf_monitor_stopped(struct io_perf_monitor_ctx *self) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this never returns errors, so can be void
. Besides the name sounds like it's checking whether the monitor has stopped, while this isn't what this function is doing
#endif | ||
} | ||
|
||
static int io_global_perf_data_get(uint32_t *data_off_size, char *data) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why char
? what does it have to do with characters / strings?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A better question would be: why all the other handlers have this kind of interface
@@ -317,6 +336,11 @@ static enum task_state ipc_do_cmd(void *data) | |||
{ | |||
struct ipc *ipc = data; | |||
|
|||
#ifdef CONFIG_SOF_TELEMETRY_IO_PERFORMANCE_MEASUREMENTS | |||
/* Increment performance counters */ | |||
io_perf_monitor_update_data(ipc->io_perf_in_msg_count, 1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why IPC commands are counted for performance? In some use-cases there are many IPCs, in others none, like plain playback or capture?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In ref. FW this feature did also count some events like IPC IDC and some interrupts.
@@ -481,6 +483,11 @@ static void idc_complete(void *data) | |||
uint32_t type = iTS(idc->received_msg.header); | |||
k_spinlock_key_t key; | |||
|
|||
#ifdef CONFIG_SOF_TELEMETRY_IO_PERFORMANCE_MEASUREMENTS | |||
/* Increment performance counters */ | |||
io_perf_monitor_update_data(idc->io_perf_out_msg_count, 1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same here, how does the number of IDCs describe IO performance?
io_perf_monitor_init_data(&dd->io_perf_bytes_count, &init_data); | ||
} | ||
#endif | ||
|
||
return 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also in the INVALID
case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we should treat performance as something optional. Perf setup failing should not make everything else fail
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, then maybe at least print a warning in the default
case at line 509 above - not a blocker, can be a follow up patch
dea7611
to
2cf375d
Compare
|
||
ret = perf_bitmap_clearbit(&self->io_performance_data_bitmap, idx); | ||
if (ret < 0) | ||
return ret; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and idx
can only be wrong if item
is wrong, and you've already used it above to set item->is_removed = true
. So, if item
can really be wrong, can we check it first before modifying anything? And if it cannot be wrong, then we should probably handle "impossible" cases differently here - either just print a big scary error message or even use an assert()
if (ret < 0) | ||
return ret; | ||
|
||
return 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
btw, this should've been
return perf_bitmap_free(&performance_data_bitmap, idx);
but if you agree to use asserts, then something like
ret = perf_bitmap_free(&performance_data_bitmap, idx);
assert(!ret);
return ret;
You could also consider return 0
but then if built with asserts disabled, the compiler might complain about assigned but not used ret
2cf375d
to
a611cc2
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@@ -433,7 +433,8 @@ int io_perf_monitor_init(void) | |||
} | |||
|
|||
/* init performance monitor using Zephyr */ | |||
SYS_INIT(io_perf_monitor_init, APPLICATION, CONFIG_APPLICATION_INIT_PRIORITY); | |||
/* IMPORTANT , this init is now in IPC init, temporarily */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
btw, we should also state the transition plan in the comment so that everyone know when this will be resolved.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, this is connected to the io_perf_monitor_init comment at the top, not sure if we have a nice place for the init. Leaving it near ipc_init works, but it's a bit ugly.
If we don't have a better place for this init, I will just leave it near ipc_init for now, and merge this temporary commit to another, or rename it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few minor comments, but nothing blocking merge.
a611cc2
to
08b6164
Compare
Lot of failures, could have been lab power off event. rerun CI. |
SOFCI TEST |
Some CI fails are due to newly added asserts failing. I'll take a look. |
08b6164
to
38c8379
Compare
Adds config option for enabling I/O performance. Signed-off-by: Tobiasz Dryjanski <tobiaszx.dryjanski@intel.com>
Implements I/O performance measurement feature. It counts number of transmitted bytes or other events that happen in various interfaces. Signed-off-by: Tobiasz Dryjanski <tobiaszx.dryjanski@intel.com>
Adds IO_PERF_MEASUREMENTS_STATE and IO_GLOBAL_PERF_DATA IPCs. Those can be used to change state of I/O performance monitor and extract the measured data. Signed-off-by: Tobiasz Dryjanski <tobiaszx.dryjanski@intel.com>
Set up a measurement of number of input and output IPCs. Signed-off-by: Tobiasz Dryjanski <tobiaszx.dryjanski@intel.com>
Set up a counter of input and output IDCs. Signed-off-by: Tobiasz Dryjanski <tobiaszx.dryjanski@intel.com>
Adds I/O performance measurement for audio unterfaces in DAI (SNDW, DMIC, SSP, HDA). Signed-off-by: Tobiasz Dryjanski <tobiaszx.dryjanski@intel.com>
Enable I/O performance for mtl and lnl Signed-off-by: Tobiasz Dryjanski <tobiaszx.dryjanski@intel.com>
38c8379
to
f7d2315
Compare
Zephyr's bitarray had a problem when clearing the bit before freeing it. Not clearing it changes nothing in overall logic so we can just not do it. Also fixes and issue with computing bitmap idx. Signed-off-by: Tobiasz Dryjanski <tobiaszx.dryjanski@intel.com>
f7d2315
to
cade98a
Compare
Implements I/O performance measurement feature. It counts number of transmitted bytes or other events that happen in various interfaces. Currently measured interfaces: IPC, IDC, SNDW, HDA, SSP, DMIC.
There are some interfaces that are defined in Reference Firmware that apparently are not used directly here: USB, I2C, I3C, UART, CSI_2 , DTF. But I could not confirm this as of yet.
There are some other minor issues I will mention in comments.