
runqueue: allocate threads to >= 1 cores #248

Closed · wants to merge 12 commits

Conversation

elenaf9 (Collaborator) commented Apr 10, 2024

See #244.

Add support for multicore to the runqueue.

The current allocation for each core is stored in a new array so that any call to `get_next_for_core` remains minimal effort.

The allocation is updated after each change in the runqueue. The reallocation currently has complexity O(n) because we (a) need to get the n highest-priority threads for the n cores, and (b) need to iterate through the old and new allocations to compare what has changed. Given that we have a very low number of cores, I'd say this is okay. Wdyt?

I also had to add a new method `del` to the `CList`, because with multiple running threads we can't assume that the deleted thread is always the head.
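For illustration, here is a minimal, self-contained sketch of the per-core allocation idea described above (names and types are placeholders, not the actual PR code): the highest-priority ready threads are cached in an array indexed by core, so `get_next_for_core` is a plain lookup, while reallocation diffs the old and new allocations.

```rust
// Hypothetical sketch, not the PR's actual code.
const N_CORES: usize = 2;

#[derive(Clone, Copy, PartialEq, Eq, Debug)]
struct ThreadId(u8);

struct RunQueue {
    /// Cached allocation: which thread each core should be running right now.
    allocation: [Option<ThreadId>; N_CORES],
    // ...the per-priority lists of ready threads would live here...
}

impl RunQueue {
    /// O(1): just read the cached allocation for this core.
    fn get_next_for_core(&self, core: usize) -> Option<ThreadId> {
        self.allocation[core]
    }

    /// O(n): take the n highest-priority threads for the n cores and
    /// diff them against the previous allocation to see what changed.
    fn reallocate(&mut self, highest_prio: &[ThreadId]) -> bool {
        let mut changed = false;
        for core in 0..N_CORES {
            let next = highest_prio.get(core).copied();
            if self.allocation[core] != next {
                self.allocation[core] = next;
                changed = true;
            }
        }
        changed
    }
}

fn main() {
    let mut rq = RunQueue { allocation: [None; N_CORES] };
    assert!(rq.reallocate(&[ThreadId(3), ThreadId(1)]));
    assert_eq!(rq.get_next_for_core(0), Some(ThreadId(3)));
}
```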

Support multiple cores in the runqueue.
The current allocation for each core is stored in a new array.
Thus, any call to `get_next_for_core` is minimal effort.
The allocation is updated after each change in the runqueue.
elenaf9 linked an issue on Apr 10, 2024 that may be closed by this pull request
kaspar030 (Collaborator) commented Apr 11, 2024

The "unsupported generic default" hax error is being tackled here: hacspec/hax#597

kaspar030 self-assigned this on Apr 11, 2024
kaspar030 (Collaborator) commented

> The "unsupported generic default" hax error is being tackled here: hacspec/hax#597

This got fixed quickly upstream, nice. :)

Here's how to work around the u8 matching hax error: hacspec/hax#598 (comment)

Clean code & improve readability.
Optimize case where `N_CORES` == 1.
elenaf9 (Collaborator, Author) commented Apr 12, 2024

The "unsupported generic default" hax error is being tackled here: hacspec/hax#597

This got fixed quickly upstream, nice. :)

Here's how to work around the u8 matching hax error: hacspec/hax#598 (comment)

Thanks!

I fixed the previous errors by using an early return and a Rust iterator. Now a new error occurs; I opened hacspec/hax#603.

In the multicore scenario, a yielding thread isn't necessarily the head
of the runqueue.
This commit adds a method `advance_from` to yield from any thread in the
queue. The thread is then moved to the tail of the runqueue.
Remove unused method `advance_head` (a duplicate of `Runqueue::advance`).
Handle case N_CORES == 1 separately.
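A minimal model of the `advance_from` semantics described in the commit message above (a plain `Vec`-based illustration, not the actual `CList` implementation): the yielding thread may sit anywhere in the queue and is moved to the tail.

```rust
// Plain Vec-based illustration, not the actual CList implementation.
fn advance_from(queue: &mut Vec<u8>, thread: u8) -> bool {
    if let Some(pos) = queue.iter().position(|&t| t == thread) {
        // The yielding thread is not necessarily the head: remove it from
        // wherever it is and append it to the tail of the runqueue.
        queue.remove(pos);
        queue.push(thread);
        true
    } else {
        false
    }
}

fn main() {
    let mut q = vec![1, 2, 3];
    // Thread 2 yields even though thread 1 is the head.
    advance_from(&mut q, 2);
    assert_eq!(q, vec![1, 3, 2]);
}
```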
elenaf9 (Collaborator, Author) commented Apr 17, 2024

I did some benchmarks to test single-core performance with this PR, using `bench_sched_yield` on the nrf52840dk.

  • Before this PR, it took 254 ticks per context switch.
  • With this PR, it increased significantly to 320 ticks (when testing the state I pushed last week, i.e. commit 65e4460).
  • With dcf08c1, I got it back down to 265 ticks by handling the case `N_CORES == 1` separately in all relevant methods.

I am not really happy with the current solution, but we need to somehow differentiate between single core and multicore, because the multicore case has many more situations that have to be handled.
I also tried using traits to provide different method implementations for single core and multicore, but Rust doesn't allow implementing something only for `N_CORES != 1`. We could for now decide to support only dual core in the multicore case, which would probably also allow further optimization of the multicore implementations here... Wdyt?
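For illustration, "handling `N_CORES == 1` separately" inside a const-generic method could look roughly like the sketch below (illustrative only, not the PR's actual code); since the branch is on a constant, the compiler can eliminate the unused arm.

```rust
// Illustrative only; not the PR's actual code.
struct RunQueue<const N_CORES: usize> {
    // ...queue state elided...
}

impl<const N_CORES: usize> RunQueue<N_CORES> {
    fn advance(&mut self) {
        if N_CORES == 1 {
            // Single-core fast path: only the head of the highest-priority
            // list can change, so there is no per-core allocation to diff.
        } else {
            // Generic multicore path: recompute and diff the allocation
            // for all cores.
        }
    }
}

fn main() {
    RunQueue::<1> {}.advance(); // compiles down to the single-core arm
    RunQueue::<2> {}.advance(); // compiles down to the multicore arm
}
```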

elenaf9 (Collaborator, Author) commented Apr 17, 2024

Additional note regarding performance:
Although the ticks per context switch will probably increase with this PR (even with further optimizations), I think it could still improve the overall performance.
Right now, the scheduler is always triggered when the state of the runqueue changes. With this PR, the calls that modify the runqueue return whether the thread running on a core changed, so we only need to trigger the scheduler if there will actually be a context switch.
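A hedged sketch of that idea (the names and the exact return type here are assumptions, not the PR's real API): runqueue mutations report which core's allocation changed, and the caller only pends the scheduler in that case.

```rust
// Illustrative sketch; names and return type are assumptions.

/// Stand-in for pending the scheduler interrupt (e.g. PendSV) on a core.
fn trigger_scheduler(core: usize) {
    println!("context switch pending on core {core}");
}

struct RunQueue {
    /// Thread currently allocated to each core.
    allocation: [Option<u8>; 2],
}

impl RunQueue {
    /// Add a ready thread; returns the core whose allocation changed,
    /// or `None` if no running thread is affected.
    fn add(&mut self, thread: u8) -> Option<usize> {
        for (core, slot) in self.allocation.iter_mut().enumerate() {
            if slot.is_none() {
                *slot = Some(thread);
                return Some(core);
            }
        }
        // All cores stay busy with their current threads: the new thread
        // just waits in the queue and no context switch is needed.
        None
    }
}

fn main() {
    let mut rq = RunQueue { allocation: [Some(1), None] };
    if let Some(core) = rq.add(2) {
        // Only trigger the scheduler when a context switch will happen.
        trigger_scheduler(core);
    }
}
```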

Add new `GlobalRunqueue` trait to differentiate and optimize for
different numbers of cores at the type level.

It uses the `min_specialization` Rust feature to provide a generic
implementation for _n_ cores and fine-grained specialization of
methods for concrete _n_.
elenaf9 (Collaborator, Author) commented Apr 19, 2024

> I also tried using traits to provide different method implementations for single core and multicore, but Rust doesn't allow implementing something only for `N_CORES != 1`. We could for now decide to support only dual core in the multicore case, which would probably also allow further optimization of the multicore implementations here... Wdyt?

I now found a solution for this using the Rust `min_specialization` feature. With it, there is now a default implementation of the relevant methods for generic `N_CORES`, which can be overridden for specific values of `N_CORES`; I did this for `N_CORES == 1` and `N_CORES == 2` (see 441611e).
When testing this on multicore (using #241), it performed much better than the previous implementation, thanks to the optimizations that are now possible for `N_CORES == 2`.
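For reference, the specialization pattern described here looks roughly like the following nightly-only sketch (the trait and method names are illustrative and may not match the PR exactly):

```rust
// Nightly-only sketch of the specialization pattern; not the PR's exact API.
#![feature(min_specialization)]

struct RunQueue<const N_CORES: usize>;

trait GlobalRunqueue {
    fn advance(&mut self);
}

// Generic default implementation for any number of cores.
impl<const N_CORES: usize> GlobalRunqueue for RunQueue<N_CORES> {
    default fn advance(&mut self) {
        // Generic multicore path: recompute the allocation for all cores.
    }
}

// Overrides for concrete core counts, as done for 1 and 2 in the PR.
impl GlobalRunqueue for RunQueue<1> {
    fn advance(&mut self) {
        // Single-core fast path.
    }
}

impl GlobalRunqueue for RunQueue<2> {
    fn advance(&mut self) {
        // Dual-core path: at most two allocation slots to compare.
    }
}

fn main() {
    RunQueue::<1>.advance(); // resolves to the single-core specialization
    RunQueue::<4>.advance(); // falls back to the generic default
}
```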

elenaf9 marked this pull request as draft on Apr 25, 2024
elenaf9 (Collaborator, Author) commented Apr 25, 2024

(Converted back to draft because an idea for an alternative implementation just came to mind.)

elenaf9 (Collaborator, Author) commented Sep 5, 2024

Closing because I decided to go for an alternative implementation of the multicore scheduler, where the runqueue itself doesn't need to support multicore.

elenaf9 closed this on Sep 5, 2024
Successfully merging this pull request may close these issues: riot-rs-runqueue: support multiple currently running threads.