-
Notifications
You must be signed in to change notification settings - Fork 318
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[DNM] schedule: add Tasks With Budget scheduler type #9075
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First part of review
ffff647
to
e8ff738
Compare
@@ -210,6 +210,10 @@ if (CONFIG_SOC_SERIES_INTEL_ADSP_ACE) | |||
${SOF_SRC_PATH}/schedule/zephyr_dp_schedule.c | |||
) | |||
|
|||
zephyr_library_sources_ifdef(CONFIG_ZEPHYR_TWB_SCHEDULER | |||
${SOF_SRC_PATH}/schedule/zephyr_twb_schedule.c | |||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Move this down to src/schedule/CMakeLists.txt
, see recent example in 0419b7c
@@ -56,6 +56,18 @@ config ZEPHYR_DP_SCHEDULER | |||
DP modules can be located in dieffrent cores than LL pipeline modules, may have | |||
different tick (i.e. 300ms for speech reccognition, etc.) | |||
|
|||
config ZEPHYR_TWB_SCHEDULER |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Move this down to src/schedule/Kconfig
, see recent example in 0419b7c
@abonislawski can you explain how one would set a "budget" for things that are dependent on the host interaction. And second, where is this budget stored? In the host driver, topology, something else? |
@plbossart my understanding is this is "general budget" for all IPC's, not for a specific IPC (like create pipeline), let's say we don't want to block other tasks (like DP) for more than 10% of systick time so there is no need to calculate it in great detail.
I didn't switched IPC task to TWB yet but based on the above (single general budget) Im expecting single define somewhere in FW in the first version @mwasko please comment from the architecture perspective if my understanding is wrong |
That is correct. Today in SOF we must decide if IPCs have higher priority then data processing modules and risk that sequence of host IPCs will cause audio glitches or the other way around and risk IPC timeouts due to heavy DSP load. Task with Budget plays a role of a guard that guarantee in each processing cycle a budget of MCPS that can be used for IPC processing in high priority and if it is exceeded the priority is lowered to let the critical audio processing to complete. |
Pushed small update with:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, it sounds like we need a test pipeline ? @abonislawski what do you have in mind for this ?
// Temporary fixed numbers, will be removed when PR ready to merge | ||
if (pdata->cycles_consumed < pdata->cycles_granted*3200) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The tmp numbers, where would these come from in real life ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Zephyr kconfig, the problem is the unit difference between timeslice api (CONFIG_SYS_CLOCK_TICKS_PER_SEC) and thread statistics api (CONFIG_SYS_CLOCK_HW_CYCLES_PER_SEC). At first I was thinking to change zephyr api and just left fixed numbers for test purposes but now I think we will just recalculate this using zephyr kconfig.
@lgirdwood I will switch IPC or IDC task to TWB in this PR so it should be heavily tested in current CI There is also one timeslice change required in Zephyr, already merged here (waiting for west update): |
Ok, IPC should be a good test. One thing to check for IDC is that this change wont impact timing around Zephyr mutex() APIs that use IDC on multicore configs. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
need an explanation why Zephyr slicing isn't enough
|
||
config LL_THREAD_PRIORITY | ||
int "LL thread cooperative high priority" | ||
default -16 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This adds a risk that if someone changes CONFIG_NUM_COOP_PRIORITIES
we might forget to change this one. Can we at least add an assert somewhere? In fact maybe even you can use arithmetics here directly and put -NUM_COOP_PRIORITIES
? I'm not sure if this would work but at least I did find one example in SOF
config MULTICORE
bool
default CORE_COUNT > 1
and one in Zephyr:
config BUILD_OUTPUT_ADJUST_LMA
depends on SECOND_CORE_MCUX && SOC_LPC54114_M0
default "-0x20010000+\
$(dt_chosen_reg_addr_hex,$(DT_CHOSEN_Z_CODE_CPU1_PARTITION))"
and an even more elaborate calculation in Zephyr:
FLASH_CHOSEN := zephyr,flash
FLASH_BASE := $(dt_chosen_reg_addr_hex,$(FLASH_CHOSEN))
FLEXSPI_BASE := $(dt_node_reg_addr_hex,/soc/spi@402a8000,1)
config BUILD_OUTPUT_ADJUST_LMA
default "$(FLEXSPI_BASE) - $(FLASH_BASE)" if NXP_IMX_RT_ROM_RAMLOADER
So, maybe that would work! But at the very least we need asserts!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was trying to do it this way but it failed when I added minus sign to default value.
But based on your examples it should work so I will check again
/** | ||
* \brief max budget limit | ||
*/ | ||
#define ZEPHYR_TWB_BUDGET_MAX (CONFIG_SYS_CLOCK_TICKS_PER_SEC / LL_TIMER_PERIOD_US) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was taught at school to verify dimensions: "ticks / second" / "microseconds" gives effectively ticks per seconds squared. Something isn't right.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
heh, you are right, it worked just because LL_TIMER_PERIOD_US is 1000 anyway
definitely I will fix this
task->state != SOF_TASK_STATE_COMPLETED) { | ||
scheduler_twb_unlock(lock_key); | ||
return -EINVAL; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd put the validity check at the top as an if
, not as an else if
and you don't need to lock yet, only after line 252, when you start accessing global data
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Early return"
if(bad)
return -EINVAL; // or goto fail:
// Entering main, non-indented, "safe" code
do_real_work()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I reversed the logic but not sure about the lock @lyakh ? In DP we set lock at the very beginning so just before checking task->state
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@abonislawski I think the Linux kernel provides a good example of a healthy approach to locking. There every single lock must have a well defined purpose. They have a check (is it by checkpatch?) that verifies, that there's a comment accompanying every single lock definition. I.e. whenever a new lock is added it mush define what it is for. And specifically it must say what data it's protecting. And it should be clearly understood what contexts can raise for access to that data.
Same here - it should be well defined and understood what data this lock is protecting. Since this is a scheduler level lock, I assume, that it protects some scheduler global data, not every task state?
if (pdata->cycles_granted) { | ||
// Temporary fixed numbers, will be removed when PR ready to merge | ||
if (pdata->cycles_consumed < pdata->cycles_granted*3200) { | ||
budget_left = (pdata->cycles_granted*3200 - pdata->cycles_consumed)*0.0003125; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, could you explain? I read both the TWB and the Zephyr thread and scheduler slicing documentation and I don't understand where these calculations come into play in them? Could you maybe give an example why the simple native Zephyr slicing isn't enough? You say, that IPC and IDC are two examples of tasks, that need TWB. So, I assume, you want to handle cases when an IPC task blocks IDCs from being handled on this core or the other way round? Wouldn't just direct Zephyr thread time slicing solve this problem?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we could do this with simple time slicing (with priority flipping in cb) but this PR is about implementing architecture described here for TWB:
https://thesofproject.github.io/latest/architectures/firmware/sof-zephyr/mpp_layer/mpp_scheduling.html
and this clearly is much more advanced logic which requires cycles calculations and some task running per LL tick.
Please take a look at figure 46, 47 and 48 which shows TWB flow, where we need to update cycles and what to do in LL tick.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
then maybe that should be explained instead of using IDC and IPC as examples?
if (thread_priority < 0) { | ||
tr_err(&twb_tr, "scheduler_twb_task_init(): non preemptible priority"); | ||
return -EINVAL; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should you also check that TWB hasn't been initialised on this core yet?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we should but I checked other schedulers and they are not checking this so probably because scheduler init fail should block core init
This will allow to organize thread priorities Signed-off-by: Adrian Bonislawski <adrian.bonislawski@intel.com>
TwB for medium priority processing (e.g., IPC/IDC message handling), - each TwB is scheduled as a separate preemptive thread, - TwB has assigned budget for processing that is refreshed in each sys tick, - TwB priority is dropped to low when budget is consumed More details: https://thesofproject.github.io/latest/architectures/firmware/sof-zephyr/mpp_layer/mpp_scheduling.html Signed-off-by: Adrian Bonislawski <adrian.bonislawski@intel.com>
Enable TWB for ACE1.5 Signed-off-by: Adrian Bonislawski <adrian.bonislawski@intel.com>
This will change scheduler type for IPC task Signed-off-by: Adrian Bonislawski <adrian.bonislawski@intel.com>
CI failed on MTL because it is using debug build and assert failed:
@nashif how hard is this limitation? Because this code actually works in normal build (without asserts) |
For debug build problems: Zephyr fix provided to unblock priority setting in ISR: The remaining problem for Threads may not be created in ISRs is creating LL task which creates another thread in ISR and actually this LL task triggers the assert. |
@abonislawski did you get a chance to resolve the issue ? |
Tasks With Budget Scheduler
TWB scheduler is a scheduler that creates a separate preemptible Zephyr thread for each SOF task that has pre-allocated MCPS budget renewed with every system tick. When the task is ready to run, then depending on the budget left in the current system tick, either MEDIUM_PRIORITY or LOW_PRIORITY is assigned to task thread. The latter allows for opportunistic execution if there is no other ready task with a higher priority while the budget is already spent.
Examples of tasks with budget: Ipc Task, Idc Task.
Task with Budget (TWB) has two key parameters assigned:
The number of cycles consumed is being reset to 0 at the beginning of each system_tick, renewing TWB budget. When the number of cycles consumed exceed cycles granted, the task is switched from MEDIUM to LOW priority. When the task with budget thread is created the MPP Scheduling is responsible to set thread time slice equal to task budget along with setting callback on time slice timeout. Thread time slicing guarantee that Zephyr scheduler will interrupt execution when the budget is spent, so MPP Scheduling timeout callback can re-evaluate task priority.
If there is a budget left in some system tick (task spent less time or started executing close to the system tick that preempts execution), it is reset and not carried over to the next tick.
Details: https://thesofproject.github.io/latest/architectures/firmware/sof-zephyr/mpp_layer/mpp_scheduling.html
Zephyr required patch:
Todo: