intel_adsp/ace: power: pad the hpsram_mask passed to power_down #75285

Conversation

@kv2019i (Collaborator) commented Jul 1, 2024

The power_down() function will lock dcache for the hpsram_mask array. On some platforms, the dcache lock will fail if the array is on a cache line that can be used for window register context saves.

Work around this by padding the hpsram_mask to a full cacheline.

Link: thesofproject/sof#9268

@@ -339,16 +339,16 @@ void pm_state_set(enum pm_state state, uint8_t substate_id)
 		(void *)rom_entry;
 	sys_cache_data_flush_range((void *)imr_layout, sizeof(*imr_layout));
 #endif /* CONFIG_ADSP_IMR_CONTEXT_SAVE */
-	uint32_t hpsram_mask = 0;
+	uint32_t hpsram_mask[CONFIG_DCACHE_LINE_SIZE] = { 0 };

Contributor commented:

You need to align this to the cache line size too, to ensure it doesn't cross a boundary. And the Kconfig value is (I think) in units of bytes, not dwords. Also, style: while I get that we have a Kconfig option for use by portable code, in this case I think clarity and safety demand you use the actual hardware value, e.g.

    __aligned(XCHAL_DCACHE_LINESIZE) uint32_t hpsram_mask[XCHAL_DCACHE_LINESIZE / sizeof(uint32_t)];

And a nitpick: no need to zero-fill it if you're only going to use one index out of the array that you're initializing yourself.
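To make the units point concrete, here is a hedged illustration assuming a 64-byte data cache line (both CONFIG_DCACHE_LINE_SIZE and XCHAL_DCACHE_LINESIZE count bytes; the array names are placeholders, not code from this PR):

    /* Sizing pitfall: using a byte count as an element count over-allocates
     * by a factor of sizeof(uint32_t).
     */
    uint32_t oversized[CONFIG_DCACHE_LINE_SIZE];                  /* 64 entries = 256 bytes */
    uint32_t one_line[XCHAL_DCACHE_LINESIZE / sizeof(uint32_t)];  /* 16 entries = 64 bytes  */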

Collaborator commented:

Yep, I also tested this yesterday, only without the oversight of using an array of 64 32-bit entries; I used 64 bytes / 4 == 16 entries. And yes, it works when using an array and aligning it on a cache-line-size boundary. Using only one of the two (either the array or the alignment) didn't help, and we don't have a good explanation for this either, right? So this version introduces a 60-byte overhead, but only when powering down, whereas my similarly "accidentally fixing" version with static has an overhead of 4 bytes for the whole lifetime of the firmware. So we can choose between the two.
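For reference, a hedged sketch of the static alternative mentioned above (the variable here is a placeholder; with a 64-byte line, the padded stack array instead costs 64 - 4 = 60 extra bytes, but only during power down):

    /* "Accidental fix" variant: keep the 4-byte mask in static storage for
     * the whole lifetime of the firmware instead of padding a stack local
     * out to a full cache line.
     */
    static uint32_t hpsram_mask; /* file-scope placeholder, not the PR's stack variable */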

@kv2019i (Collaborator, author) commented:

Thanks @andyross @lyakh, corrected based on feedback.

I did further tests and it's indeed the stack setup of the calling function pm_state_set() that is the triggering condition. As the exact conditions are not known, adding both the alignment and the padded size makes sense. This also fits the call8 convention, where there is window save space both before and after the stack variables. With this change, we ensure any register saves/restores happen on a different cache line than the one we use for hpsram_mask.
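A minimal sketch of the resulting shape (v2 and later), assuming XCHAL_DCACHE_LINESIZE from the Xtensa HAL; HPSRAM_MASK_VALUE stands in for the bank mask that pm_state_set() actually computes:

    /* Align the buffer to a data cache line and pad it to a full line, so
     * the line that power_down() locks cannot also hold the caller's call8
     * register-window save area. Only index 0 carries the mask.
     */
    __aligned(XCHAL_DCACHE_LINESIZE)
    uint32_t hpsram_mask[XCHAL_DCACHE_LINESIZE / sizeof(uint32_t)];

    hpsram_mask[0] = HPSRAM_MASK_VALUE; /* placeholder for the computed mask */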

@kv2019i (Collaborator, author) commented Jul 2, 2024

V2 pushed, marking as ready for review.

kv2019i added the platform: Intel ADSP label on Jul 2, 2024
kv2019i force-pushed the 202407-fix-powerdown-dcache-try2 branch 2 times, most recently from 45d1980 to 00e36e0 on July 2, 2024 09:26
kv2019i marked this pull request as ready for review on July 2, 2024 09:28
@kv2019i (Collaborator, author) commented Jul 2, 2024

V3:

  • fixed a conformance failure by splitting the one 100+ character line in the patch; no other changes

kv2019i added a commit to kv2019i/sof that referenced this pull request Jul 2, 2024
kv2019i added the bug label on Jul 2, 2024
@kv2019i (Collaborator, author) commented Jul 2, 2024

Tested on actual devices at thesofproject/sof#9274

kv2019i force-pushed the 202407-fix-powerdown-dcache-try2 branch from 00e36e0 to 22dd1eb on July 2, 2024 09:46
@kv2019i (Collaborator, author) commented Jul 2, 2024

V4:

  • one more compliance check fix -- add an empty line after the declarations at line 343; no other changes

The power_down() function will lock dcache for the hpsram_mask
array. On some platforms, the dcache lock will fail if the array
is on a cache line that can be used for window register context
saves.

Work around this by aligning and padding the hpsram_mask to cacheline
size.

Link: thesofproject/sof#9268
Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
kv2019i force-pushed the 202407-fix-powerdown-dcache-try2 branch from 22dd1eb to d8fcc20 on July 2, 2024 12:43
@kv2019i (Collaborator, author) commented Jul 2, 2024

V5:

  • the compliance check really doesn't like my code... now "rewording" how I express the alignment, let's see if the check is happy now

@andyross (Contributor) left a comment:

Looks as clean as can be expected. Still not a root cause, but "feels like the right way to do it".

dcpleung added this to the v3.7.0 milestone on Jul 2, 2024
@kv2019i (Collaborator, author) commented Jul 3, 2024

#75108 merged instead, closing this one (it won't apply anymore).

kv2019i closed this on Jul 3, 2024
kv2019i added a commit to kv2019i/sof that referenced this pull request Aug 13, 2024
kv2019i added a commit to kv2019i/sof that referenced this pull request Aug 13, 2024
kv2019i added a commit to kv2019i/sof that referenced this pull request Aug 21, 2024
Labels
bug (The issue is a bug, or the PR is fixing a bug), platform: Intel ADSP (Intel Audio platforms)
Projects
None yet

5 participants