Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ipc: add cache flushing and invalidation for IPC data #9630

Conversation

tmleman
Copy link
Contributor

@tmleman tmleman commented Oct 31, 2024

This patch addresses an issue with incorrect IPC responses due to the lack of cache flushing and invalidation on secondary cores.

The following changes have been made:

  1. Added cache writeback for IPC message data in ipc_msg_send when the current core is not the primary core.
  2. Added cache invalidation for IPC message data in ipc_prepare_to_send before writing to the mailbox.

These changes ensure that the IPC data is correctly synchronized between cores.

Copy link
Collaborator

@kv2019i kv2019i left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, see comment inline.

@@ -214,6 +214,9 @@ void ipc_msg_send(struct ipc_msg *msg, void *data, bool high_priority)
msg->tx_data != data) {
ret = memcpy_s(msg->tx_data, msg->tx_size, data, msg->tx_size);
assert(!ret);
if (!cpu_is_primary(cpu_get_id()))
dcache_writeback_region((__sparse_force void __sparse_cache *)msg->tx_data,
msg->tx_size);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good catch. I wonder if a better fix would be to just allocate tx_data as uncached.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Possible, but it probably comes with a performance cost. It's hard for me to determine what is better. However, invalidating each message also seems problematic to me, and I considered adding some flag in the structure that would indicate that the payload should be invalidated before copying.

Copy link
Member

@lgirdwood lgirdwood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tmleman good stuff - can you put an inline comment before each cache op here to help everyone follow the flows.

@lyakh
Copy link
Collaborator

lyakh commented Nov 4, 2024

@tmleman any idea why we haven't seen these failures in CI? Is the contents of received IPCs never verified?

@tmleman
Copy link
Contributor Author

tmleman commented Nov 4, 2024

any idea why we haven't seen these failures in CI? Is the contents of received IPCs never verified?

It seems to me that this is because most of the large_config_get messages are processed on the main core, and the responses to messages that were sent to secondary cores only return a status informing whether the operation was successful (messages such as module_init or set_pipeline_state).

@tmleman tmleman force-pushed the topic/upstream/pr/ipc4/secondary_core_response branch 2 times, most recently from de9a0d9 to bdbe8ce Compare November 4, 2024 12:21
@@ -72,9 +72,9 @@ struct ipc4_msg_data {
static struct ipc4_msg_data msg_data;

/* fw sends a fw ipc message to send the status of the last host ipc message */
static struct ipc_msg msg_reply = {0, 0, 0, 0, LIST_INIT(msg_reply.list)};
static struct ipc_msg msg_reply = {0, 0, 0, 0, false, LIST_INIT(msg_reply.list)};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are those static structures ever cleaned? If not, once is_shared is set to true it'll never be set to back false

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, that was a good catch. It seems to me that I managed to cover all cases now.

@tmleman tmleman force-pushed the topic/upstream/pr/ipc4/secondary_core_response branch 2 times, most recently from ee2bb7b to e43f8c4 Compare November 4, 2024 20:34
This patch addresses an issue with incorrect IPC responses due to the
lack of cache flushing and invalidation on secondary cores.

The following changes have been made:

1. Added cache writeback for IPC message data in `ipc_msg_send` when the
   current core is not the primary core.
2. Added cache invalidation for IPC message data in
   `ipc_prepare_to_send` before writing to the mailbox.

These changes ensure that the IPC data is correctly synchronized between
cores.

Signed-off-by: Tomasz Leman <tomasz.m.leman@intel.com>
Copy link
Member

@lgirdwood lgirdwood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CI passing with multicore tests.

Copy link
Collaborator

@kv2019i kv2019i left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still wonder whether it would be overall better to just not use cached alloc for tx_data given IPC processing is not on the DSP hot path. OTOH, with the current approach (with cached used), this PR is needed.

@lgirdwood lgirdwood merged commit 79ca991 into thesofproject:main Nov 5, 2024
45 of 47 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants