[LNL]Suspend resume with playback or capture will Fail or TIMEOUT #5080

Open
ssavati opened this issue Jun 24, 2024 · 46 comments
Assignees
Labels
bug Something isn't working LNL Applies to Lunar Lake platform suspend resume Issues related to suspend resume (e.g. rtcwake)

Comments

@ssavati

ssavati commented Jun 24, 2024

Suspend resume with playback or capture will Fail or TIMEOUT

Steps to reproduce

  • Use a NOCODEC, HDA, or SDW setup; the issue is observed sporadically on all devices. The commands below are for NOCODEC.
  • TPLG=/lib/firmware/intel/development/sof-lnl-nocodec.tplg MODEL=LNLM_RVP_NOCODEC SOF_TEST_INTERVAL=5 ~/sof-test/test-case/check-suspend-resume-with-audio.sh -l 5 -m playback
    or
    TPLG=/lib/firmware/intel/development/sof-lnl-nocodec.tplg MODEL=LNLM_RVP_NOCODEC SOF_TEST_INTERVAL=5 ~/sof-test/test-case/check-suspend-resume-with-audio.sh -l 5 -m capture

The test will either fail or hang

Logs
console_logs.txt
dmesg_logs.txt

Configs
Linux Commit:
d25788054d59
KConfig Branch:
master
KConfig Commit:
c3171afedc63
SOF Branch:
stable-v2.10
SOF Commit:
b15f1f1a3238

cc:

@kv2019i kv2019i added bug Something isn't working LNL Applies to Lunar Lake platform labels Jun 24, 2024
@kv2019i
Collaborator

kv2019i commented Jun 24, 2024

Failure logs are a bit unclear w.r.t. where things start to go south. The last messages to FW seem sane, cores are powered down, finally SET_DX is sent without errors. But then there's a generic kernel pm -EBUSY error for late_suspend (one example, Intel test run 43010):

[  408.720709] kernel: snd_sof_intel_hda_common:hda_dsp_state_log: sof-audio-pci-intel-lnl 0000:00:1f.3: Current DSP power state: D3
[  408.720720] kernel: snd_sof:sof_set_fw_state: sof-audio-pci-intel-lnl 0000:00:1f.3: fw_state change: 7 -> 0
[  412.467352] kernel: PM: late suspend of devices failed

Running more tests to get insights on the failures.

@kv2019i
Collaborator

kv2019i commented Jun 26, 2024

Digging further into the scenario that #5085 addresses, it seems the following combination of events happens:

  • deep buffer PCM enters D0i3x state
  • an xrun happens just at suspend, so arecord is doing xrun handling (the PCM is stopped, prepared and started again after resume)
  • as the PCM is in the stopped state (due to the xrun), SNDRV_PCM_TRIGGER_SUSPEND is not sent to the DAI, and the NULL dereference can be hit

Example log

[  882.643397] DEBUG: cmd=1 dai SSP0 Pin direction 0
# KV: deep buffer PCM31 started
[  882.648298] snd_sof:sof_pcm_trigger: sof-audio-pci-intel-lnl 0000:00:1f.3: pcm: trigger stream 31 dir 0 cmd 1
[  882.648307] snd_sof:sof_ipc4_trigger_pipelines: sof-audio-pci-intel-lnl 0000:00:1f.3: trigger cmd: 1 state: 4
[  882.648315] snd_sof:sof_ipc4_set_pipeline_state: sof-audio-pci-intel-lnl 0000:00:1f.3: ipc4 set pipeline instance 0 state 3
[  882.648322] snd_sof:sof_ipc4_log_header: sof-audio-pci-intel-lnl 0000:00:1f.3: ipc tx      : 0x13000003|0x0: GLB_SET_PIPELINE_STATE
[  882.648759] snd_sof:sof_ipc4_log_header: sof-audio-pci-intel-lnl 0000:00:1f.3: ipc tx reply: 0x33000000|0x0: GLB_SET_PIPELINE_STATE
[  882.648788] snd_sof:sof_ipc4_log_header: sof-audio-pci-intel-lnl 0000:00:1f.3: ipc tx done : 0x13000003|0x0: GLB_SET_PIPELINE_STATE
[  882.648795] snd_sof:sof_ipc4_set_pipeline_state: sof-audio-pci-intel-lnl 0000:00:1f.3: ipc4 set pipeline instance 0 state 4
[  882.648800] snd_sof:sof_ipc4_log_header: sof-audio-pci-intel-lnl 0000:00:1f.3: ipc tx      : 0x13000004|0x0: GLB_SET_PIPELINE_STATE
[  882.649963] snd_sof:sof_ipc4_log_header: sof-audio-pci-intel-lnl 0000:00:1f.3: ipc tx reply: 0x33000000|0x0: GLB_SET_PIPELINE_STATE
[  882.649985] snd_sof:sof_ipc4_log_header: sof-audio-pci-intel-lnl 0000:00:1f.3: ipc tx done : 0x13000004|0x0: GLB_SET_PIPELINE_STATE
[  882.649997] snd_sof_intel_hda_common:hda_dsp_stream_trigger: sof-audio-pci-intel-lnl 0000:00:1f.3: FW Poll Status: reg[0x1e0]=0x2014001e successful
[  884.689120] asix 3-3:1.0 enx34298f712491: Link is Up - 100Mbps/Full - flow control rx/tx
# KV: opportunistic D0ix entry is attempted at 887 sec
[  887.690497] snd_sof:sof_ipc4_log_header: sof-audio-pci-intel-lnl 0000:00:1f.3: ipc tx      : 0x48000000|0x12: MOD_SET_D0IX
[  887.691066] snd_sof:sof_ipc4_log_header: sof-audio-pci-intel-lnl 0000:00:1f.3: ipc tx reply: 0x68000000|0x12: MOD_SET_D0IX
[  887.691109] snd_sof:sof_ipc4_log_header: sof-audio-pci-intel-lnl 0000:00:1f.3: ipc tx done : 0x48000000|0x12: MOD_SET_D0IX
[  887.691117] snd_sof_intel_hda_common:hda_dsp_state_log: sof-audio-pci-intel-lnl 0000:00:1f.3: Current DSP power state: D0I3
# KV: suspend process is started soon after
[  887.983323] PM: suspend entry (s2idle)
[  887.985028] Filesystems sync: 0.001 seconds
# KV: user-space sees xruns and stops the PCM as part of xrun handling; user-space is not yet frozen
[  888.121816] snd_sof:sof_pcm_trigger: sof-audio-pci-intel-lnl 0000:00:1f.3: pcm: trigger stream 31 dir 0 cmd 0
[  888.121825] snd_sof:sof_ipc4_trigger_pipelines: sof-audio-pci-intel-lnl 0000:00:1f.3: trigger cmd: 0 state: 3
[  888.121830] snd_sof:sof_ipc4_set_pipeline_state: sof-audio-pci-intel-lnl 0000:00:1f.3: ipc4 set pipeline instance 0 state 3
[  888.121915] snd_sof:sof_ipc4_log_header: sof-audio-pci-intel-lnl 0000:00:1f.3: ipc tx      : 0x48000000|0x4: MOD_SET_D0IX
[  888.122275] snd_sof:sof_ipc4_log_header: sof-audio-pci-intel-lnl 0000:00:1f.3: ipc tx reply: 0x68000000|0x4: MOD_SET_D0IX
[  888.122288] snd_sof:sof_ipc4_log_header: sof-audio-pci-intel-lnl 0000:00:1f.3: ipc tx done : 0x48000000|0x4: MOD_SET_D0IX
[  888.122290] snd_sof_intel_hda_common:hda_dsp_state_log: sof-audio-pci-intel-lnl 0000:00:1f.3: Current DSP power state: D0I0
[...]
[  889.222280] snd_sof:sof_widget_free_unlocked: sof-audio-pci-intel-lnl 0000:00:1f.3: widget pipeline.2 freed
[  889.222285] snd_sof:sof_widget_free_unlocked: sof-audio-pci-intel-lnl 0000:00:1f.3: widget dai-copier.SSP.NoCodec-0.playback freed
# KV: much later hda_dai_suspend() is called with hext_stream non-NULL and this will kill the kernel
[  889.262695] DEBUG: dai suspend 13a0f000

PR #5085 will avoid the kernel crash, but xrun handling that runs concurrently with system suspend still does not succeed, and streaming later terminates with:

[  893.639609] snd_sof_intel_hda_common:hda_dai_trigger: sof-audio-pci-intel-lnl 0000:00:1f.3: cmd=1 dai SSP0 Pin direction 0
[  893.639621] sof-audio-pci-intel-lnl 0000:00:1f.3: ASoC: error at soc_dai_trigger on SSP0 Pin: -22
[  893.648704]  Deepbuffer Port0: ASoC: trigger FE cmd: 1 failed: -22

@plbossart
Member

Wow, that's quite a corner case: an xrun during suspend, no one saw that coming.

The part that I am not following is that the xrun seems to be reproducible, which suggests some sort of D0i3 problem in the first place.

Also, if the PCM device is stopped before a system suspend, it's not clear to me what userspace will do on resume. Would it go on with a prepare/trigger, or would there be an additional stage of hw_free/hw_params?

@kv2019i
Collaborator

kv2019i commented Jun 27, 2024

@plbossart I'll try to see if this is completely reproducible, but indeed the xrun seems to start the chain of failures. The user-space will continue xrun handling as soon as resume is complete:

# PCM31 is prepared just before user-space is suspended
[  889.160611] snd_sof:sof_widget_free_unlocked: sof-audio-pci-intel-lnl 0000:00:1f.3: widget pipeline.2 freed
[  889.160621] snd_sof:sof_widget_free_unlocked: sof-audio-pci-intel-lnl 0000:00:1f.3: widget dai-copier.SSP.NoCodec-0.playback freed
[  889.160638] snd_sof:sof_pcm_prepare: sof-audio-pci-intel-lnl 0000:00:1f.3: pcm: prepare stream 31 dir 0
[  889.160641] snd_sof:sof_pcm_hw_params: sof-audio-pci-intel-lnl 0000:00:1f.3: pcm: hw params stream 31 dir 0
# then we enter suspend
[  889.311599] ACPI: EC: interrupt blocked
[  893.102613] ACPI: EC: interrupt unblocked
# resume starts, DSP is booted
[  893.207454] snd_sof:snd_sof_run_firmware: sof-audio-pci-intel-lnl 0000:00:1f.3: firmware boot complete
[  893.207465] snd_sof:sof_set_fw_state: sof-audio-pci-intel-lnl 0000:00:1f.3: fw_state change: 6 -> 7
# system resume completes in kernel and resumes user-space
[  893.616716] OOM killer enabled.
[  893.616721] Restarting tasks ... 
# upon resume prepare is called AGAIN on the PCM31 (this is normal after system suspend)
[  893.618603] snd_sof:sof_pcm_prepare: sof-audio-pci-intel-lnl 0000:00:1f.3: pcm: prepare stream 31 dir 0
[  893.618613] snd_sof:sof_pcm_hw_params: sof-audio-pci-intel-lnl 0000:00:1f.3: pcm: hw params stream 31 dir 0
# to my eyes, the DSP pipeline setup goes just fine, no errors from the IPCs
# then FE PCM START trigger fails and I see no errors from SOF
[  893.639609] snd_sof_intel_hda_common:hda_dai_trigger: sof-audio-pci-intel-lnl 0000:00:1f.3: cmd=1 dai SSP0 Pin direction 0
[  893.639621] sof-audio-pci-intel-lnl 0000:00:1f.3: ASoC: error at soc_dai_trigger on SSP0 Pin: -22
[  893.648704]  Deepbuffer Port0: ASoC: trigger FE cmd: 1 failed: -22
# then we just see user-space tearing down the PCM

In user-space console, this shows as:

rtcwake: wakeup from "mem" using /dev/rtc0 at Wed Jun 26 19:20:20 2024
underrun!!! (at least 1028.938 ms long)
                                       2024-06-26 19:20:20 UTC Sub-Test: [COMMAND] sleep for 5
aplay: pcm_write:2127: write error: Input/output error

So the open questions are, first, why the xrun occurs in the first place (deep buffer, D0i3 mode entry, ...), and second, why prepare is called twice and whether the flow is correct here (I have a hunch the double prepare confuses the driver state somewhere, and thus we get the -22 from the FE trigger on resume).

bardliao pushed a commit that referenced this issue Jun 27, 2024
When system enters suspend with an active stream, SOF core
calls hw_params_upon_resume(). On Intel platforms with HDA DMA used
to manage the link DMA, this leads to call chain of

   hda_dsp_set_hw_params_upon_resume()
 -> hda_dsp_dais_suspend()
 -> hda_dai_suspend()
 -> hda_ipc4_post_trigger()

A bug is hit in hda_dai_suspend() as hda_link_dma_cleanup() is run first,
which clears hext_stream->link_substream, and then hda_ipc4_post_trigger()
is called with a NULL snd_pcm_substream pointer.

Fixes: 2b009fa ("ASoC: SOF: Intel: hda: Unify DAI drv ops for IPC3 and IPC4")
Link: #5080
Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
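
The ordering problem described in this commit message can be illustrated with a small toy model. This is not the kernel code and not the actual patch: the Python names below only mirror the C function names from the message, and the "reordered" variant shows one possible shape of a fix (use the substream before cleanup clears it), which is an assumption made purely for illustration.

```python
class HextStream:
    """Toy stand-in for the HDA extended stream; only models link_substream."""

    def __init__(self, substream):
        self.link_substream = substream


def link_dma_cleanup(hext_stream):
    # Mirrors the described behaviour: cleanup clears the substream pointer.
    hext_stream.link_substream = None


def post_trigger(substream):
    # Stands in for hda_ipc4_post_trigger(), which needs a valid substream.
    if substream is None:
        raise ValueError("post_trigger called with NULL substream")
    return "post_trigger(%s)" % substream


def dai_suspend_buggy(hext_stream):
    link_dma_cleanup(hext_stream)                    # clears link_substream first
    return post_trigger(hext_stream.link_substream)  # then uses it: fails


def dai_suspend_reordered(hext_stream):
    result = post_trigger(hext_stream.link_substream)  # use while still valid
    link_dma_cleanup(hext_stream)
    return result


stream = HextStream("pcm31")
print(dai_suspend_reordered(stream), stream.link_substream)
```

In the kernel the equivalent of the `ValueError` is a NULL pointer dereference, which is why the whole system went down rather than just the stream.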
@kv2019i
Collaborator

kv2019i commented Jun 27, 2024

Some refinements:

  • hit the condition (=suspend with substream present) with D0ix disabled (streams marked not compatible with D0ix)
  • hit the condition once without a preceding xrun

For the latter, I can't really explain why this happens yet. There's a trigger to stop the PCM right after suspend starts, but nothing is printed out by aplay, so this appears to come from the ALSA core:

[47625.798898] PM: suspend entry (s2idle)
[47625.800453] Filesystems sync: 0.001 seconds
[47626.294808] snd_sof:sof_pcm_trigger: sof-audio-pci-intel-lnl 0000:00:1f.3: pcm: trigger stream 0 dir 0 cmd 0

@kv2019i
Collaborator

kv2019i commented Jun 27, 2024

It seems the failure sequence is related to a hiccup in the system suspend process. Whenever SOF sees the unexpected sequence (no SUSPEND action for the DAI), there is an unexpected gap of about 2 seconds in the system suspend process. This often generates an xrun, but the exact sequence depends on the length of the gap:

# success case, completes within 10ms
[69489.712206] Filesystems sync: 0.001 seconds
[69489.717798] Freezing user space processes
[69489.719699] Freezing user space processes completed (elapsed 0.001 seconds)
# success case:
[69500.740018] Filesystems sync: 0.001 seconds
[69500.745193] Freezing user space processes
[69500.747031] Freezing user space processes completed (elapsed 0.001 seconds)
# success case:
[69511.757591] Filesystems sync: 0.001 seconds
[69511.761816] Freezing user space processes
[69511.763580] Freezing user space processes completed (elapsed 0.001 seconds)
# FAILURE case (without #5085, the kernel would crash here)
[69522.814773] Filesystems sync: 0.001 seconds
# note the roughly 2-second gap! user-space will stop the PCM after the xrun, and this is done before user-space is frozen
[69524.304214] Freezing user space processes
[69524.305612] Freezing user space processes completed (elapsed 0.001 seconds)
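
To spot this gap across many suspend cycles, a log scan along the following lines may help. This is only a sketch: it assumes plain `dmesg` output in the format shown above (timestamp in brackets followed directly by the message, no `kernel:` prefix), and the function name `freeze_gaps` and the 1-second threshold are arbitrary choices, not anything defined by sof-test.

```python
import re

# "[69522.814773] message" -> (timestamp, message)
TS_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s*(.*)$")


def freeze_gaps(lines):
    """Seconds between each 'Filesystems sync' line and the next freeze start."""
    gaps = []
    sync_ts = None
    for line in lines:
        m = TS_RE.match(line)
        if not m:
            continue  # skip annotations and non-timestamped lines
        ts, msg = float(m.group(1)), m.group(2).strip()
        if msg.startswith("Filesystems sync"):
            sync_ts = ts
        elif msg == "Freezing user space processes" and sync_ts is not None:
            gaps.append(ts - sync_ts)
            sync_ts = None
    return gaps


log = [
    "[69500.740018] Filesystems sync: 0.001 seconds",
    "[69500.745193] Freezing user space processes",
    "[69500.747031] Freezing user space processes completed (elapsed 0.001 seconds)",
    "[69522.814773] Filesystems sync: 0.001 seconds",
    "[69524.304214] Freezing user space processes",
    "[69524.305612] Freezing user space processes completed (elapsed 0.001 seconds)",
]
gaps = freeze_gaps(log)
slow = [g for g in gaps if g > 1.0]  # arbitrary threshold for "unusual"
print(gaps, slow)
```

On the excerpt above, the success case comes out at about 5 ms and the failure case at about 1.49 s between the filesystem sync and the start of the freeze.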

@kv2019i
Collaborator

kv2019i commented Jun 28, 2024

With the kernel fix in place, we now see similar occurrences in CI, with suspend taking longer than expected and the following error in the log:

kernel:

[  334.703012] kernel: snd_sof:sof_ipc4_log_header: sof-audio-pci-intel-lnl 0000:00:1f.3: ipc tx      : 0x13000003|0x0: GLB_SET_PIPELINE_STATE
[  335.206800] kernel: sof-audio-pci-intel-lnl 0000:00:1f.3: ipc timed out for 0x13000003|0x0
[... kernel error messages]
[  335.207407] kernel: snd_sof:sof_ipc4_log_header: sof-audio-pci-intel-lnl 0000:00:1f.3: ipc tx      : 0x46000002|0x3: MOD_UNBIND

FW:

[  334.702861] <inf> ipc: ipc_cmd: rx	: 0x13000003|0x0
[  334.702875] <inf> pipe: pipeline_trigger: pipe:0 0x0 pipe trigger cmd 2
[  334.702953] <inf> ll_schedule: zephyr_ll_task_done: task complete 0xa0119e00 0x20210U
[  334.702956] <inf> ll_schedule: zephyr_ll_task_done: num_tasks 2 total_num_tasks 2
[  335.207238] <inf> ipc: ipc_cmd: rx	: 0x46000002|0x3

So the SET_PIPELINE_STATE response is not seen by the host. This only happens when the system suspend process has the unusual delay, so something unusual is going on on the host system side. I'll keep the bug on the driver side until we get more insight into what happens with the suspend.
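
A quick way to find such unanswered requests in a kernel log is to pair each `ipc tx` line with the following `ipc tx reply` or `ipc timed out` line. The sketch below assumes the log format shown in this thread and that only one IPC request is outstanding at a time (an assumption, not something verified here). The function name `ipc_outcomes` is hypothetical.

```python
import re

TS = r"^\[\s*(\d+\.\d+)\]"
HDR = r"(0x[0-9a-f]+\|0x[0-9a-f]+)"
TX_RE = re.compile(TS + r".*ipc tx\s+: " + HDR + r": (\w+)")
REPLY_RE = re.compile(TS + r".*ipc tx reply: " + HDR)
TIMEOUT_RE = re.compile(TS + r".*ipc timed out for " + HDR)


def ipc_outcomes(lines):
    """Return (name, header, outcome, seconds) for each answered or timed-out request."""
    results = []
    pending = None  # (timestamp, header, name) of the request awaiting a reply
    for line in lines:
        m = TX_RE.search(line)
        if m:
            pending = (float(m.group(1)), m.group(2), m.group(3))
            continue
        if pending is None:
            continue
        m = REPLY_RE.search(line)
        if m:
            results.append((pending[2], pending[1], "reply", float(m.group(1)) - pending[0]))
            pending = None
            continue
        m = TIMEOUT_RE.search(line)
        if m and m.group(2) == pending[1]:
            results.append((pending[2], pending[1], "timeout", float(m.group(1)) - pending[0]))
            pending = None
    return results


sample = [
    "[  882.648322] snd_sof:sof_ipc4_log_header: sof-audio-pci-intel-lnl 0000:00:1f.3: ipc tx      : 0x13000003|0x0: GLB_SET_PIPELINE_STATE",
    "[  882.648759] snd_sof:sof_ipc4_log_header: sof-audio-pci-intel-lnl 0000:00:1f.3: ipc tx reply: 0x33000000|0x0: GLB_SET_PIPELINE_STATE",
    "[  334.703012] kernel: snd_sof:sof_ipc4_log_header: sof-audio-pci-intel-lnl 0000:00:1f.3: ipc tx      : 0x13000003|0x0: GLB_SET_PIPELINE_STATE",
    "[  335.206800] kernel: sof-audio-pci-intel-lnl 0000:00:1f.3: ipc timed out for 0x13000003|0x0",
]
outcomes = ipc_outcomes(sample)
for name, header, outcome, seconds in outcomes:
    print(name, header, outcome, round(seconds, 6))
```

On the two excerpts from this thread, the healthy request is answered in well under a millisecond, while the failing one times out roughly half a second after it was sent.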

@bardliao
Collaborator

> So the SET_PIPELINE_STATE response is not seen by the host.

Is it possible that the FW died before it sent the response? For example, it may have died while processing the MOD_UNBIND command.

plbossart pushed a commit that referenced this issue Jul 4, 2024
Fixes: 2b009fa ("ASoC: SOF: Intel: hda: Unify DAI drv ops for IPC3 and IPC4")
Link: #5080
Reviewed-by: Bard Liao <yung-chuan.liao@linux.intel.com>
Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
intel-lab-lkp pushed a commit to intel-lab-lkp/linux that referenced this issue Jul 9, 2024
Fixes: 2b009fa ("ASoC: SOF: Intel: hda: Unify DAI drv ops for IPC3 and IPC4")
Link: thesofproject#5080
Reviewed-by: Bard Liao <yung-chuan.liao@linux.intel.com>
Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Link: https://patch.msgid.link/20240704085708.371414-1-pierre-louis.bossart@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
@lgirdwood
Member

lgirdwood commented Jul 10, 2024

@ssavati are you able to rerun the test now that the PR above has merged? If all is good, please close. Thanks!

@ssavati
Author

ssavati commented Jul 10, 2024

> @ssavati are you able to rerun the test now that the PR above has merged? If all is good, please close. Thanks!

Sure, I will run some functional and stress test cases related to suspend/resume.

@ssavati
Author

ssavati commented Jul 11, 2024

@lgirdwood failures are still observed. For more details, see CI #/result/planresultdetail/43738

@kv2019i
Collaborator

kv2019i commented Jul 11, 2024

Ack @ssavati @lgirdwood I think this is expected. The kernel PR fixes the kernel crash, but there are still failures. The whole system doesn't hit a panic anymore, but we hit a failure nevertheless (see comment #5080 (comment)).

@marc-hb marc-hb added the suspend resume Issues related to suspend resume (e.g. rtcwake) label Jul 11, 2024
@lgirdwood
Member

@kv2019i @ssavati what's the repro rate of the suspend delay failure? The 2 second delay looks like we are blocked on pending IO completing. PCM core will copy to DMA buffers, but this may atomically block on the audio HW ptr being updated too?

johnny-mnemonic pushed a commit to linux-ia64/linux-stable-rc that referenced this issue Jul 14, 2024
[ Upstream commit 9065693 ]

Fixes: 2b009fa ("ASoC: SOF: Intel: hda: Unify DAI drv ops for IPC3 and IPC4")
Link: thesofproject/linux#5080
Reviewed-by: Bard Liao <yung-chuan.liao@linux.intel.com>
Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Link: https://patch.msgid.link/20240704085708.371414-1-pierre-louis.bossart@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
mj22226 pushed a commit to mj22226/linux that referenced this issue Jul 15, 2024
johnny-mnemonic pushed a commit to linux-ia64/linux-stable-rc that referenced this issue Jul 15, 2024
@kv2019i kv2019i assigned ranj063 and unassigned kv2019i Jul 16, 2024
@ranj063
Collaborator

ranj063 commented Jul 22, 2024

> I wonder what happens to the firmware state if it xruns before suspend and the context is saved. Not sure how well it can restart.... What would happen if we disable the context save @ranj063 ?

@plbossart it was my understanding that we do not enable CONTEXT_SAVE. Is that not true for LNL? @fredoh9 ?

@marc-hb
Collaborator

marc-hb commented Jul 22, 2024

sof$ git grep CONFIG_ADSP_IMR_CONTEXT_SAVE

app/boards/intel_adsp_ace15_mtpm.conf:CONFIG_ADSP_IMR_CONTEXT_SAVE=n
app/boards/intel_adsp_ace20_lnl.conf:CONFIG_ADSP_IMR_CONTEXT_SAVE=y
app/boards/intel_adsp_ace30_ptl.conf:CONFIG_ADSP_IMR_CONTEXT_SAVE=y

thesofproject/sof@83343c5

commit 83343c51bb8703ba8fd2d5233dc8a74fc1443dce
Author:     Jaroslaw Stelter <Jaroslaw.Stelter@intel.com>
AuthorDate: Thu Jul 6 12:37:06 2023 +0200
Commit:     Liam Girdwood <lgirdwood@gmail.com>
CommitDate: Fri Jul 7 15:49:36 2023 +0100

    lnl: app: Fix configuration for D3 flow

    LNL configuration must be updated to fix D3 flow
    on ACE 2.0 platform.

    Signed-off-by: Jaroslaw Stelter <Jaroslaw.Stelter@intel.com>
---
app/boards/intel_adsp_ace20_lnl.conf | 2 ++
1 file changed, 2 insertions(+)

diff --git a/app/boards/intel_adsp_ace20_lnl.conf b/app/boards/intel_adsp_ace20_lnl.conf
index 804cbd036588..8ee046402e04 100644
--- a/app/boards/intel_adsp_ace20_lnl.conf
+++ b/app/boards/intel_adsp_ace20_lnl.conf
@@ -14,8 +14,10 @@ CONFIG_COMP_SRC_IPC4_FULL_MATRIX=y
 CONFIG_PM=y
 CONFIG_PM_DEVICE=y
 CONFIG_PM_DEVICE_RUNTIME=y
+CONFIG_PM_DEVICE_RUNTIME_EXCLUSIVE=n
 CONFIG_PM_DEVICE_POWER_DOMAIN=y
 CONFIG_PM_POLICY_CUSTOM=y
+CONFIG_ADSP_IMR_CONTEXT_SAVE=y

@marc-hb
Collaborator

marc-hb commented Jul 22, 2024

We may still have some CI secrets left (working on that) but thank God firmware configuration is not one of them

sof/scripts/xtensa-build-zephyr.py  lnl

zgrep IMR build-sof-staging/sof-info/lnl/config.gz

CONFIG_DT_HAS_INTEL_ADSP_IMR_ENABLED=y
CONFIG_ADSP_IMR_CONTEXT_SAVE=y

grep IMR build-lnl/zephyr/.config

CONFIG_DT_HAS_INTEL_ADSP_IMR_ENABLED=y
CONFIG_ADSP_IMR_CONTEXT_SAVE=y
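
The same check can be scripted so it can run against any build output. This is a minimal sketch that parses Kconfig-style text, assuming the usual `CONFIG_X=value` and `# CONFIG_X is not set` line forms. The helper names `parse_kconfig` and `read_kconfig` are made up for this example.

```python
import gzip


def parse_kconfig(text):
    """Parse Kconfig-style text into a dict of option name -> value string."""
    options = {}
    unset_suffix = " is not set"
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("# CONFIG_") and line.endswith(unset_suffix):
            # "# CONFIG_FOO is not set" is how Kconfig records a disabled option.
            options[line[2:-len(unset_suffix)]] = "n"
        elif line.startswith("CONFIG_") and "=" in line:
            key, value = line.split("=", 1)
            options[key] = value
    return options


def read_kconfig(path):
    """Read a .config file, transparently handling a gzip-compressed one."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as f:
        return parse_kconfig(f.read())


example = """CONFIG_DT_HAS_INTEL_ADSP_IMR_ENABLED=y
CONFIG_ADSP_IMR_CONTEXT_SAVE=y
# CONFIG_SOME_OTHER_OPTION is not set
"""
cfg = parse_kconfig(example)
print(cfg.get("CONFIG_ADSP_IMR_CONTEXT_SAVE"))
```

Calling `read_kconfig("build-sof-staging/sof-info/lnl/config.gz")` or `read_kconfig("build-lnl/zephyr/.config")` would then give the values for the two files checked above. `CONFIG_SOME_OTHER_OPTION` is a placeholder, not a real option.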

@marc-hb
Collaborator

marc-hb commented Jul 22, 2024

For MTL, a new issue is also observed.

“check-suspend-resume-with-playback-100” is able to complete 100 cycles, but at the end it fails with the error below and a TIMEOUT happens. It is observed on both SDW and HDA.
2024-07-19 19:09:58 UTC [REMOTE_ERROR] firmware path not found from journalctl, no firmware loaded or debug option disabled?

I just submitted an sof-test fix that should help with that, please review:

@ranj063
Collaborator

ranj063 commented Jul 22, 2024

> I wonder what happens to the firmware state if it xruns before suspend and the context is saved. Not sure how well it can restart.... What would happen if we disable the context save @ranj063 ?

@plbossart we just ran the test with context_save disabled and still ran into the same underruns

opsiff pushed a commit to opsiff/UOS-kernel that referenced this issue Jul 29, 2024
[ Upstream commit 9065693dcc13f287b9e4991f43aee70cf5538fdd ]

Fixes: 2b009fa ("ASoC: SOF: Intel: hda: Unify DAI drv ops for IPC3 and IPC4")
Link: thesofproject/linux#5080
Reviewed-by: Bard Liao <yung-chuan.liao@linux.intel.com>
Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Link: https://patch.msgid.link/20240704085708.371414-1-pierre-louis.bossart@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
(cherry picked from commit 8246bbf818ed7b8d5afc92b951e6d562b45c2450)
bardliao pushed a commit to bardliao/linux that referenced this issue Jul 31, 2024
hda_dai_suspend() was added to handle paused stream during system
suspend. But as a side effect, it also ends up cleaning up the DMA
data for those streams that were prepared but not triggered before a
system suspend. Since these streams will not receive the prepare
callback after resuming, we need to preserve the DMA data during suspend.
So, add the check to handle only those streams that are in the paused
state to avoid losing the DMA data for all other streams.

Link: thesofproject#5080
Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
Reviewed-by: Fred Oh <fred.oh@linux.intel.com>
Reviewed-by: Bard Liao <yung-chuan.liao@linux.intel.com>
Reviewed-by: Péter Ujfalusi <peter.ujfalusi@linux.intel.com>
Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
bardliao pushed a commit to bardliao/linux that referenced this issue Aug 1, 2024
bardliao pushed a commit that referenced this issue Aug 2, 2024
plbossart pushed a commit that referenced this issue Aug 5, 2024
Dangku pushed a commit to Dangku/sunxi-linux that referenced this issue Aug 7, 2024
[ Upstream commit 9065693dcc13f287b9e4991f43aee70cf5538fdd ]

When system enters suspend with an active stream, SOF core
calls hw_params_upon_resume(). On Intel platforms with HDA DMA used
to manage the link DMA, this leads to call chain of

   hda_dsp_set_hw_params_upon_resume()
 -> hda_dsp_dais_suspend()
 -> hda_dai_suspend()
 -> hda_ipc4_post_trigger()

A bug is hit in hda_dai_suspend() as hda_link_dma_cleanup() is run first,
which clears hext_stream->link_substream, and then hda_ipc4_post_trigger()
is called with a NULL snd_pcm_substream pointer.

Fixes: 2b009fa ("ASoC: SOF: Intel: hda: Unify DAI drv ops for IPC3 and IPC4")
Link: thesofproject/linux#5080
Reviewed-by: Bard Liao <yung-chuan.liao@linux.intel.com>
Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Link: https://patch.msgid.link/20240704085708.371414-1-pierre-louis.bossart@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: August <2819763+Dangku@users.noreply.github.com>
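The ordering problem described in this commit message can be illustrated with a small model. The Python sketch below is not the kernel code: `HextStream`, `link_dma_cleanup()`, `post_trigger()` and both `suspend_*` functions are hypothetical stand-ins named after the functions mentioned above. Capturing the pointer before cleanup is only one possible way to avoid the problem and is an assumption made for illustration; the commit message quoted here does not say how the upstream change resolves it.

```python
from typing import Optional


class HextStream:
    """Hypothetical stand-in for the extended stream object."""

    def __init__(self, substream: Optional[str]):
        self.link_substream = substream


def link_dma_cleanup(hext: HextStream) -> None:
    # Models the cleanup step: the substream reference is cleared.
    hext.link_substream = None


def post_trigger(substream: Optional[str]) -> str:
    # Models a callback that needs a valid substream.
    if substream is None:
        raise ValueError("post_trigger called with no substream")
    return f"post_trigger({substream})"


def suspend_buggy(hext: HextStream) -> str:
    link_dma_cleanup(hext)                    # clears link_substream first
    return post_trigger(hext.link_substream)  # then passes None


def suspend_saved_pointer(hext: HextStream) -> str:
    substream = hext.link_substream  # capture before cleanup (assumed approach)
    link_dma_cleanup(hext)
    return post_trigger(substream)


try:
    suspend_buggy(HextStream("pcm0"))
except ValueError as err:
    print("buggy order:", err)

print("saved pointer:", suspend_saved_pointer(HextStream("pcm0")))
```

In C the equivalent of the `ValueError` here would be a NULL pointer dereference rather than a catchable exception.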
Avenger-285714 pushed a commit to deepin-community/kernel that referenced this issue Aug 12, 2024
(cherry picked from commit 8246bbf818ed7b8d5afc92b951e6d562b45c2450)
@kv2019i
Collaborator

kv2019i commented Aug 19, 2024

@ssavati @ranj063 I was checking the daily results, and it seems this issue is not happening in the HDA configuration. Starting with the daily of Aug 7th, there are a few timeout failures (like test run 44787), but these happen without any audio activity. In the last week there were no cases of failed xrun handling on an HDA configuration.

@ranj063
Collaborator

ranj063 commented Aug 20, 2024

> @ssavati @ranj063 I was checking the daily results, and it seems this issue is not happening in the HDA configuration. Starting with the daily of Aug 7th, there are a few timeout failures (like test run 44787), but these happen without any audio activity. In the last week there were no cases of failed xrun handling on an HDA configuration.

@kv2019i we only run loops of 5 iterations in our daily tests, and that is not long enough to reproduce the failure. @fredoh9 and I confirmed just a couple of days ago that it is reproducible on HDA as well. Sometimes it takes about 200 iterations to hit the error, but sometimes only a few.
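
As a rough illustration of why a short loop can miss this failure, assume (hypothetically) that each suspend/resume iteration fails independently with a fixed probability `p`. Neither the independence nor the value of `p` comes from this thread; `p = 1/200` is picked only to loosely match the "about 200 iterations" figure, and `hit_probability()` is a helper written for this sketch.

```python
def hit_probability(p: float, n: int) -> float:
    """Chance of at least one failure in n independent iterations."""
    return 1.0 - (1.0 - p) ** n


p = 1 / 200  # assumed per-iteration failure probability
for n in (5, 200, 1000):
    print(f"{n:>5} iterations: {hit_probability(p, n):.1%}")
```

Under these assumptions a 5-iteration loop would hit the failure only about 2.5% of the time, while a 200-iteration loop would hit it roughly 63% of the time.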

jhautbois pushed a commit to YoseliSAS/linux that referenced this issue Aug 21, 2024
bardliao pushed a commit to bardliao/linux that referenced this issue Aug 22, 2024
bardliao pushed a commit to bardliao/linux that referenced this issue Aug 22, 2024
@lgirdwood
Member

Recoverable so now not a blocker.

@ssavati
Author

ssavati commented Aug 26, 2024

On LNL HDA, I ran a suspend/resume test for playback with PipeWire over a large number of iterations.

  • 1st run: 703 iterations completed. An XRUN message occasionally appeared on the playback console, but the playback stream continued without terminating. I stopped the test manually.
  • 2nd run: 445 iterations completed. An XRUN message occasionally appeared on the playback console, but the playback stream continued without terminating. After 445+ iterations the system hung: it did not resume from suspend and the device needed a forced shutdown.

Kernel: 6.11.0-rc4-g65661c88571e
dmesg and console logs are attached:

pw_play_dmesg.txt
pw_play_console.txt
pw_suspend_resume.txt

PipeWire default sink during playback (aplay -Dpipewire -r 48000 -c 2 -f S16_LE -d 10000 /dev/zero -v -q):

$ pactl list short sinks | grep RUNNING
48 alsa_output.pci-0000_00_1f.3-platform-skl_hda_dsp_generic.HiFi__hw_sofhdadsp__sink PipeWire s32le 2ch 48000Hz RUNNING

cc: @lgirdwood @kv2019i @spkrishna

@marc-hb
Collaborator

marc-hb commented Aug 26, 2024

> After 445+ iterations the system hung: it did not resume from suspend and the device needed a forced shutdown.

Re-adding previous label just to be on the safe side: apologies if I misunderstood something.

bardliao pushed a commit to bardliao/linux that referenced this issue Aug 30, 2024
bardliao pushed a commit to bardliao/linux that referenced this issue Sep 10, 2024
bardliao pushed a commit to bardliao/linux that referenced this issue Sep 10, 2024
bardliao pushed a commit to bardliao/linux that referenced this issue Sep 12, 2024
tuxedo-bot pushed a commit to tuxedocomputers/linux that referenced this issue Sep 13, 2024
BugLink: https://bugs.launchpad.net/bugs/2078289

[ Upstream commit 9065693 ]

When system enters suspend with an active stream, SOF core
calls hw_params_upon_resume(). On Intel platforms with HDA DMA used
to manage the link DMA, this leads to call chain of

   hda_dsp_set_hw_params_upon_resume()
 -> hda_dsp_dais_suspend()
 -> hda_dai_suspend()
 -> hda_ipc4_post_trigger()

A bug is hit in hda_dai_suspend() as hda_link_dma_cleanup() is run first,
which clears hext_stream->link_substream, and then hda_ipc4_post_trigger()
is called with a NULL snd_pcm_substream pointer.

Fixes: 2b009fa ("ASoC: SOF: Intel: hda: Unify DAI drv ops for IPC3 and IPC4")
Link: thesofproject/linux#5080
Reviewed-by: Bard Liao <yung-chuan.liao@linux.intel.com>
Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Link: https://patch.msgid.link/20240704085708.371414-1-pierre-louis.bossart@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Portia Stephens <portia.stephens@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
tuxedo-bot pushed a commit to tuxedocomputers/linux that referenced this issue Sep 27, 2024