-
Notifications
You must be signed in to change notification settings - Fork 6.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
intel_adsp/ace: power: pad the hpsram_mask passed to power_down #76046
intel_adsp/ace: power: pad the hpsram_mask passed to power_down #76046
Conversation
9268 was supposedly fixed and has been closed for a while. If the previous attempts were not enough then you must link to them and reference the corresponding commits in this commit message with BTW is this supposed to help with thesofproject/sof#9308 ? |
thesofproject/sof#9309 didn't work |
Ok, thanks to @lyakh now I see that this is just a re-submission of https://github.com/kv2019i/zephyr/commits/202407-fix-powerdown-dcache-try2/ which was a previous (and not merged) attempt to fix thesofproject/sof#9268. Still, it's very confusing to see a fix for a closed bug... Was this meant as a draft? |
@marc-hb @lyakh I apologize, gentlemen, but this change was not intended to improve anything for the MTL (it's about the PTL, but I will describe that later). Thank you, however, for your interest and your contribution.
Yes, it probably should have been a draft, but I decided to make it an open PR right away because I know that it would not escape your attention anyway. I did not include explanations because I was busy debugging problems on the MTL. Best regards. |
Drafts achieve (at least) 3 things:
Drafts are great, please use and abuse them! If you want 2. and 3. but you still want to notify people then very easy: click "ready to review" and then back to "convert to draft". This notifies everyone and then does NOT unsubscribe anyone. |
@marc-hb @lyakh I have updated the description, so the PR is ready for review. I would maybe wait with the merge until we fix issue thesofproject/sof#9308. When the fix is merged, I will do a rebase and update the PR on the SOF side. |
/* This assumes a single HPSRAM segment */ | ||
static uint32_t hpsram_mask; | ||
const int dcache_words = XCHAL_DCACHE_LINESIZE / 4; | ||
uint32_t hpsram_mask[dcache_words] __aligned(XCHAL_DCACHE_LINESIZE); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this moves it back to the stack, which by itself already moves it away from any potentially neighbouring cache lines... TBH I'm now confused - this more or less reverts #75108 or, in other words, switches to an alternative fix for the same issue by reviving #75285. Would be good to get some more information
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moving the variable to the bss causes an exception on PTL. We still have the same exception (LoadStoreErrorCause(3)), just on a different platform. The problem still occurred on LNL and stopped after a rebase that did not introduce changes affecting that area.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This still feels closer to a root cause than the first, so +1 from me. It still worries me that we don't have a real answer.
Did anyone ever get a chance to whitebox the failure? Like, try to lock 1, 2, 3... cache lines at the same index and see when/how it fails? Then iterate over all 256 (or whatever) indices and see if there's one particular one that fails early (because, maybe, the ROM left something locked?)
b767597
to
be73b43
Compare
Rebase |
519ebc1
to
3d2051b
Compare
The power_down() function will lock dcache for the hpsram_mask array. On some platforms, the dcache lock will fail if the array is on cache line that can be used for window register context saves. Work around this by aligning and padding the hpsram_mask to cacheline size. Link: thesofproject/sof#9268 Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
I've spent some time on this problem recently and unfortunately, I haven't gotten to the bottom of it yet. For now, I know that alignment and padding are the only solutions that have worked on all platforms so far. What I plan to do now is to merge this patch, add D3 tests for PTL to CI, and then rework |
This PR contains only a cherry-pick of the fix that was proposed and reviewed earlier in this #75285 PR. The reason I am pushing it again is that the change that was merged does not completely fix the problem.
Recently, the same issue that we identified on the MTL could be observed in another PR on SOF in builds for LNL (the problem stopped reproducing after a rebase and so far has not reappeared on LNL). The same issue still reproduces on PTL and without some kind of fix, we are unable to add tests to CI that check if this type of power flow works. The change proposed by @kv2019i is so far the only solution that reliably fixes this problem on all platforms.
I apologize for taking so long with the explanations, I had to focus on other tasks.