Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] [stable-v2.2] overrun in pause/resume tests on HDA configs #9139

Open
marc-hb opened this issue May 17, 2024 · 6 comments
Open

[BUG] [stable-v2.2] overrun in pause/resume tests on HDA configs #9139

marc-hb opened this issue May 17, 2024 · 6 comments
Labels
bug Something isn't working as expected P3 Low-impact bugs or features

Comments

@marc-hb
Copy link
Collaborator

marc-hb commented May 17, 2024

Describe the bug

ERROR hda_dma_link_check_xrun(): overrun detected when testing capture in pause-resume tests.

For some unknown reason, this is observed only on WHL, no other stable-v2.2 configuration seems affected.

EDIT: same on ADL and ADL-N refresh.

It seems PCM-specific?

To Reproduce

TPLG=/lib/firmware/intel/sof-tplg/sof-hda-generic-4ch.tplg MODEL=WHL_UPEXT_HDA-ipc3 SOF_TEST_INTERVAL=5 ~/sof-test/test-case/check-pause-resume.sh -c 100 -m capture

or

TPLG=/lib/firmware/intel/sof-tplg/sof-hda-generic-4ch.tplg MODEL=WHL_UPEXT_HDA-ipc3 SOF_TEST_INTERVAL=5 ~/sof-test/test-case/multiple-pause-resume.sh -r 25T

Reproduction Rate

100%. Observed in daily tests runs:

41199?model=WHL_UPEXT_HDA-ipc3&testcase=check-pause-resume-capture-100
41199?model=WHL_UPEXT_HDA-ipc3&testcase=multiple-pause-resume-25

41235?model=WHL_UPEXT_HDA-ipc3&testcase=check-pause-resume-capture-100
41235?model=WHL_UPEXT_HDA-ipc3&testcase=multiple-pause-resume-25

This is very likely a very old issue, exposed recently only thanks to:

Expected behavior

No ERROR in the firmware logs.

Environment

Linux Branch: topic/sof-dev
Linux Commit: 055cba4766af
KConfig Branch: master
KConfig Commit: 8fee06f8fd8a

SOF Build Result: 4027
SOF Branch: stable-v2.2
SOF Commit: 64416ba51354

Screenshots or console output

[      124404.578390] (          11.250000) c0 ll-schedule        ./schedule/ll_schedule.c:517  INFO task params pri 0 flags 0 start 0 period 1000
[      124421.713806] (          17.135416) c0 ll-schedule        ./schedule/ll_schedule.c:418  INFO new added task->start 1806829227 at 1806826150
[      124433.224222] (          11.510416) c0 ll-schedule        ./schedule/ll_schedule.c:421  INFO num_tasks 3 total_num_tasks 3
[      124626.609631] (         193.385406) c0 eq-iir       2.11  ../audio/eq_iir/eq_iir.c:788  INFO eq_iir_trigger()
[      124646.974214] (          20.364582) c0 eq-iir       2.11  ../audio/eq_iir/eq_iir.c:788  INFO eq_iir_trigger()
[      125877.755415] (        1230.781250) c0 hda-dma            ..../intel/hda/hda-dma.c:951  ERROR hda_dma_link_check_xrun(): overrun detected
[      228458.636755] (      102580.882812) c0 ll-schedule        ./schedule/ll_schedule.c:186  INFO task 0xbe1c0100 pipe-task  avg 1719, max 2082
[      338651.496960] (      110192.859375) c0 ipc                  src/ipc/ipc3/handler.c:1605 INFO ipc: new cmd 0x60060000
[      338667.069876] (          15.572916) c0 pipe         2.16  ....../pipeline-stream.c:270  INFO pipe trigger cmd 2
[      339624.153171] (         957.083313) c0 eq-iir       2.11  ../audio/eq_iir/eq_iir.c:788  INFO eq_iir_trigger()
[      339645.351087] (          21.197916) c0 ll-schedule        ./schedule/ll_schedule.c:159  INFO task complete 0xbe1c0100 pipe-task 
[      339656.496920] (          11.145833) c0 ll-schedule        ./schedule/ll_schedule.c:162  INFO num_tasks 2 total_num_tasks 2
[      339823.111497] (         166.614578) c0 ipc                  src/ipc/ipc3/handler.c:1605 INFO ipc: new cmd 0x80010000
[      339839.673996] (          16.562500) c0 ipc                  src/ipc/ipc3/handler.c:595  INFO ipc: dai 3.1 -> config 
[      339851.913579] (          12.239583) c0 dai                           src/lib/dai.c:166  INFO dai_get type 3 index 1 new sref 2
[      339863.580245] (          11.666666) c0 hda-dai      3.1   /drivers/intel/hda/hda.c:42   INFO hda_set_config(): channels 2 rate 48000
[      339875.403161] (          11.822916) c0 dai                           src/lib/dai.c:190  INFO dai_put type 3 index 1 new sref 1
[      339887.226077] (          11.822916) c0 ipc                      src/ipc/ipc3/dai.c:185  INFO ipc_comp_dai_config() dai type = 3 index = 1
[      339900.090660] (          12.864583) c0 dai          2.15        src/ipc/ipc3/dai.c:279  INFO dai_config() dai type = 3 index = 1 dd 0x9e0fe100
[      459186.075504] (      119285.984375) c0 ipc                  src/ipc/ipc3/handler.c:1605 INFO ipc: new cmd 0x60070000
[      459201.648420] (          15.572916) c0 pipe         2.16  ....../pipeline-stream.c:270  INFO pipe trigger cmd 8
[      459221.544252] (          19.895832) c0 ll-schedule        ./schedule/ll_schedule.c:513  INFO task add 0xbe1c0100 pipe-task 
[      459233.106752] (          11.562500) c0 ll-schedule        ./schedule/ll_schedule.c:517  INFO task params pri 0 flags 0 start 0 period 1000
[      459250.398418] (          17.291666) c0 ll-schedule        ./schedule/ll_schedule.c:418  INFO new added task->start 1813261227 at 1813254861
[      459262.117167] (          11.718750) c0 ll-schedule        ./schedule/ll_schedule.c:421  INFO num_tasks 3 total_num_tasks 3
[      459626.804653] (         364.687500) c0 eq-iir       2.11  ../audio/eq_iir/eq_iir.c:788  INFO eq_iir_trigger()
[      459647.012985] (          20.208332) c0 eq-iir       2.11  ../audio/eq_iir/eq_iir.c:788  INFO eq_iir_trigger()
[      460877.742103] (        1230.729126) c0 hda-dma            ..../intel/hda/hda-dma.c:951  ERROR hda_dma_link_check_xrun(): overrun detected
[      672581.067024] (      211703.328125) c0 ipc                  src/ipc/ipc3/handler.c:1605 INFO ipc: new cmd 0x60060000
[      672597.056607] (          15.989583) c0 pipe         2.16  ....../pipeline-stream.c:270  INFO pipe trigger cmd 2
[      673374.556576] (         777.499939) c0 eq-iir       2.11  ../audio/eq_iir/eq_iir.c:788  INFO eq_iir_trigger()
[      673395.962825] (          21.406250) c0 ll-schedule        ./schedule/ll_schedule.c:159  INFO task complete 0xbe1c0100 pipe-task 
[      673407.056575] (          11.093750) c0 ll-schedule        ./schedule/ll_schedule.c:162  INFO num_tasks 2 total_num_tasks 2
[      673711.587812] (         304.531250) c0 ipc                  src/ipc/ipc3/handler.c:1605 INFO ipc: new cmd 0x80010000
[      673728.410728] (          16.822916) c0 ipc                  src/ipc/ipc3/handler.c:595  INFO ipc: dai 3.1 -> config 
[      673740.702395] (          12.291666) c0 dai                           src/lib/dai.c:166  INFO dai_get type 3 index 1 new sref 2
[      673752.629477] (          11.927083) c0 hda-dai      3.1   /drivers/intel/hda/hda.c:42   INFO hda_set_config(): channels 2 rate 48000

cc:

@marc-hb marc-hb added the bug Something isn't working as expected label May 17, 2024
@marc-hb
Copy link
Collaborator Author

marc-hb commented May 20, 2024

See again in daily 41235 and 41277. 100% deterministic.

@abonislawski abonislawski added the P2 Critical bugs or normal features label May 21, 2024
@lgirdwood lgirdwood added P3 Low-impact bugs or features and removed P2 Critical bugs or normal features labels Jun 18, 2024
@ssavati ssavati changed the title [BUG] [WHL] [stable-v2.2] capture overrun in pause/resume tests [BUG] [stable-v2.2] capture overrun in pause/resume tests Jul 8, 2024
@ssavati
Copy link

ssavati commented Jul 8, 2024

@kv2019i this issue also obaserved on other platform like ADLP and ADL-N refresh

@marc-hb
Copy link
Collaborator Author

marc-hb commented Jul 26, 2024

With #9135, this is still one of the two failures hitting several of our models every day in our stable 2.2 IPC3 test plan. Like clockwork.

#9135 seems to affect only NOCODEC configurations. #9139 not just.

@marc-hb marc-hb changed the title [BUG] [stable-v2.2] capture overrun in pause/resume tests [BUG] [stable-v2.2] overrun in pause/resume tests on HDA configs Jul 26, 2024
@marc-hb
Copy link
Collaborator Author

marc-hb commented Jul 26, 2024

Two important observations thanks to @ranj063 's great help:

  • Overruns in the single pipeline test seem to always and only affect HDA analog capture
  • Overruns in the multiple pipeline test seem to always and only affect playback on HDMI2 and HDMI3 simultaneously.

Thanks to the new ADL-N refresh, we can now observe this consistently across two generations very far apart: WHL-HDA and ADL-N refresh HDA.

@ranj063
Copy link
Collaborator

ranj063 commented Jul 26, 2024

ERROR hda_dma_link_check_xrun(): overrun detected when testing capture in pause-resume tests.

I think it makes sense that all we see these prints in the case of HDMI/HDA Analog PCM's. I dont think the overruns reported by the firmware trace are detrimental though. The test goes on to complete all requested 100 iteration of pause/release. The script does the right think by calling out these errors to report failures too. IMHO, i think we should reduce the log level for the prints to be warnings instead or error.

@marc-hb
Copy link
Collaborator Author

marc-hb commented Jul 27, 2024

@ranj063 any idea why there is no overrun message when running the same test with Zephyr?

https://sof-ci.01.org/sofpr/PR9335/build6750/devicetest/index.html?model=TGLU_UP_HDA-ipc4&testcase=check-pause-resume-capture-100
I can only see the infamous "no bytes to copy" message there (#8761)

There are overruns messages (not ERRORs) but only when testing with multiple PCMs:
https://sof-ci.01.org/sofpr/PR9335/build6750/devicetest/index.html?model=TGLU_UP_HDA-ipc4&testcase=multiple-pause-resume-50

I don't know yet whether these overruns happen with the same HDMI ports: is there an easy way to tell from the firmware logs?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working as expected P3 Low-impact bugs or features
Projects
None yet
Development

No branches or pull requests

5 participants