Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG][Multi-core] ipc timed out for 0x13000004|0x1GLB_SET_PIPELINE_STATE when testing non-core 0 pipeline on MTL-NOCODEC #7649

Closed
keqiaozhang opened this issue May 19, 2023 · 9 comments
Assignees
Labels
bug Something isn't working as expected I2S Applies to I2S bus for codec connection IPC timeout IPC timeout observed MTL Applies to Meteor Lake platform mtl-005 multicore Issues observed when not only core#0 is used. multicore-3cores Issues observed when 3 cores are used. P1 Blocker bugs or important features

Comments

@keqiaozhang
Copy link
Collaborator

keqiaozhang commented May 19, 2023

Describe the bug
Testing mtl-005-stable-drop branch on MTL-NOCODEC platform with multi-core tplg, but all non-core0 pipelines failed the test with IPC times out. The reproduce rate is 100%.

DMESG

[  898.191889] snd_sof:sof_ipc4_log_header: sof-audio-pci-intel-mtl 0000:00:1f.3: ipc tx      : 0x13000003|0x1: GLB_SET_PIPELINE_STATE [data size: 12]
[  898.192296] snd_sof:sof_ipc4_log_header: sof-audio-pci-intel-mtl 0000:00:1f.3: ipc tx reply: 0x33000000|0x1: GLB_SET_PIPELINE_STATE
[  898.192326] snd_sof:sof_ipc4_log_header: sof-audio-pci-intel-mtl 0000:00:1f.3: ipc tx done : 0x13000003|0x1: GLB_SET_PIPELINE_STATE [data size: 12]
[  898.192333] snd_sof:sof_ipc4_log_header: sof-audio-pci-intel-mtl 0000:00:1f.3: ipc tx      : 0x13000004|0x1: GLB_SET_PIPELINE_STATE [data size: 12]
[  898.700002] sof-audio-pci-intel-mtl 0000:00:1f.3: ipc timed out for 0x13000004|0x1
[  898.700032] sof-audio-pci-intel-mtl 0000:00:1f.3: Attempting to prevent DSP from entering D3 state to preserve context
[  898.700041] sof-audio-pci-intel-mtl 0000:00:1f.3: ------------[ IPC dump start ]------------
[  898.700059] sof-audio-pci-intel-mtl 0000:00:1f.3: Host IPC initiator: 0x93000004|0x1|0x0, target: 0x13000004|0x1|0x0, ctl: 0x3
[  898.700070] sof-audio-pci-intel-mtl 0000:00:1f.3: ------------[ IPC dump end ]------------
[  898.700078] sof-audio-pci-intel-mtl 0000:00:1f.3: ------------[ DSP dump start ]------------
[  898.700084] sof-audio-pci-intel-mtl 0000:00:1f.3: IPC timeout
[  898.700091] sof-audio-pci-intel-mtl 0000:00:1f.3: fw_state: SOF_FW_BOOT_COMPLETE (7)
[  898.700107] sof-audio-pci-intel-mtl 0000:00:1f.3: ROM status: 0x5, ROM error: 0x0
[  898.700113] sof-audio-pci-intel-mtl 0000:00:1f.3: ROM debug status: 0x50000005, ROM debug error: 0x0
[  898.700123] sof-audio-pci-intel-mtl 0000:00:1f.3: ROM feature bit enabled
[  898.700128] sof-audio-pci-intel-mtl 0000:00:1f.3: ------------[ DSP dump end ]------------
[  898.700158] sof-audio-pci-intel-mtl 0000:00:1f.3: failed to set final state 4 for all pipelines
[  898.700168] sof-audio-pci-intel-mtl 0000:00:1f.3: ASoC: error at soc_component_trigger on 0000:00:1f.3: -110
[  898.700181]  Port2: ASoC: trigger FE cmd: 1 failed: -110

TPLG
image

To Reproduce
$ aplay -Dhw:0,1 -r 48000 -c 2 -f S16_LE -d 3 /dev/zero -v -q
or
$ aplay -Dhw:0,2 -r 48000 -c 2 -f S16_LE -d 3 /dev/zero -v -q

Reproduction Rate
100%

Environment

  1. Branch name and commit hash of the 2 repositories: sof (firmware/topology) and linux (kernel driver).
  2. Name of the topology file
  3. Name of the platform(s) on which the bug is observed.
    • Platform: MTL-RVP-NOCODEC

dmesg.txt
mtrace.txt

@keqiaozhang keqiaozhang added bug Something isn't working as expected P1 Blocker bugs or important features I2S Applies to I2S bus for codec connection multicore Issues observed when not only core#0 is used. MTL Applies to Meteor Lake platform mtl-005 labels May 19, 2023
@mengdonglin mengdonglin added the IPC timeout IPC timeout observed label May 19, 2023
@abonislawski
Copy link
Member

Multi-core fix added to 005, it should cover case when several pipelines used on a single secondary core.
Please verify if it helps here or in #7650

@keqiaozhang
Copy link
Collaborator Author

keqiaozhang commented May 19, 2023

This issue is still reproducible on latest mtl-005/ddc2ef6. But it won't fail immediately and the errors are also changed.

[  228.014879] kernel: snd_sof:sof_ipc4_prepare_copier_module: sof-audio-pci-intel-mtl 0000:00:1f.3: copier copier.SSP.4.1, IPC size is 188
[  228.014883] kernel: snd_sof:sof_ipc4_update_resource_usage: sof-audio-pci-intel-mtl 0000:00:1f.3: copier.SSP.4.1: ibs / obs / cpc: 384 / 384 / 6388
[  228.014891] kernel: snd_sof:sof_ipc4_log_header: sof-audio-pci-intel-mtl 0000:00:1f.3: ipc tx      : 0x47000000|0x0: MOD_SET_DX [data size: 8]
[  228.517095] kernel: sof-audio-pci-intel-mtl 0000:00:1f.3: ipc timed out for 0x47000000|0x0
[  228.517140] kernel: sof-audio-pci-intel-mtl 0000:00:1f.3: Attempting to prevent DSP from entering D3 state to preserve context
[  228.517155] kernel: sof-audio-pci-intel-mtl 0000:00:1f.3: ------------[ IPC dump start ]------------
[  228.517218] kernel: sof-audio-pci-intel-mtl 0000:00:1f.3: Host IPC initiator: 0x47000000|0x0|0x0, target: 0x67000000|0x0|0x0, ctl: 0x3
[  228.517238] kernel: sof-audio-pci-intel-mtl 0000:00:1f.3: ------------[ IPC dump end ]------------
[  228.517250] kernel: sof-audio-pci-intel-mtl 0000:00:1f.3: ------------[ DSP dump start ]------------
[  228.517261] kernel: sof-audio-pci-intel-mtl 0000:00:1f.3: IPC timeout
[  228.517269] kernel: sof-audio-pci-intel-mtl 0000:00:1f.3: fw_state: SOF_FW_BOOT_COMPLETE (7)
[  228.517288] kernel: sof-audio-pci-intel-mtl 0000:00:1f.3: ROM status: 0x0, ROM error: 0x0
[  228.517294] kernel: sof-audio-pci-intel-mtl 0000:00:1f.3: ROM debug status: 0x50000005, ROM debug error: 0x0
[  228.517304] kernel: sof-audio-pci-intel-mtl 0000:00:1f.3: ROM feature bit enabled
[  228.517309] kernel: sof-audio-pci-intel-mtl 0000:00:1f.3: ------------[ DSP dump end ]------------
[  228.517369] kernel: sof-audio-pci-intel-mtl 0000:00:1f.3: error: failed to enable target core for widget pipeline.3
[  228.517381] kernel: sof-audio-pci-intel-mtl 0000:00:1f.3: Failed to set up connected widgets
[  228.517402] kernel: sof-audio-pci-intel-mtl 0000:00:1f.3: error: failed widget list set up for pcm 1 dir 0
[  228.517410] kernel: sof-audio-pci-intel-mtl 0000:00:1f.3: ASoC: error at snd_soc_pcm_component_hw_params on 0000:00:1f.3: -110
[  228.517446] kernel:  Port1: ASoC: error at __soc_pcm_hw_params on Port1: -110
[  228.517462] kernel:  Port1: ASoC: error at dpcm_fe_dai_hw_params on Port1: -110
[  228.517478] kernel: snd_sof:sof_pcm_hw_free: sof-audio-pci-intel-mtl 0000:00:1f.3: pcm: free stream 1 dir 0
[  228.518201] kernel: snd_sof:sof_pcm_close: sof-audio-pci-intel-mtl 0000:00:1f.3: pcm: close stream 1 dir 0

@abonislawski
Copy link
Member

This is not the same issue, just another one

@abonislawski
Copy link
Member

Probably now we hit this one
#7108 (comment)

@keqiaozhang
Copy link
Collaborator Author

I checked this issue again today and found that this issue only happened at the first/second round of test after system reboot.

But this issue cannot be reproduced after reloading kmod,
Test steps:

1. rebooting the device
2. check aplay on non-core 0 pipelines. it fails at first or second round of test.
3. then run kmod reload
    $ sof-test/tools/kmod/sof_remove.sh
    $ sof-test/tools/kmod/sof_insert.sh
4. run test again on non-core0 pipelines, it will always pass the test. 

@abonislawski abonislawski self-assigned this May 22, 2023
@abonislawski
Copy link
Member

abonislawski commented May 30, 2023

Verified today with latest 005 version.
Original problem is fixed along with another one in 7108.

Right now it can hit another issue, aplay -Dhw:0,1 is working fine one by one but (as @keqiaozhang advised) we can add delay and break it.
That's because driver will put dsp to D3 state in such case:

[ +0.001031] snd_sof_intel_hda_common:hda_dsp_state_log: sof-audio-pci-intel-mtl 0000:00:1f.3: Current DSP power state: D3
[ +0.000004] snd_sof:sof_set_fw_state: sof-audio-pci-intel-mtl 0000:00:1f.3: fw_state change: 7 -> 0

and restore when we start another aplay, core 1 will be powered up successfully but hit timeout on the first IPC (create pipeline).

I think this is already third issue in this thread so it would be great to close this and open another one for D3 restore scenario.

@mengdonglin FYI: IMR context save is enabled only on 005, thats why it can behave differently on main branch.
6c57b64

@aborisovich
Copy link
Contributor

aborisovich commented Jun 18, 2023

Reproduced the issue, checked sof main branch with my 2 modified PRs:
#7325
zephyrproject-rtos/zephyr#56168
Unfortunately this does not work and fails very differently (on system boot multiple IPC timeout error on pipeline creation).
dmesg (2).txt
Cherry-picked those PRs along with PM fix zephyrproject-rtos/zephyr@5ea27ec to mtl-0050-drop-stable .
This brought no improvement over originally reported issue... (used aplay -Dhw:0,2 -r 48000 -c 2 -f S16_LE -d 3 /dev/zero -v -q)

[  149.967451] snd_sof:sof_ipc4_log_header: sof-audio-pci-intel-mtl 0000:00:1f.3: ipc tx reply: 0x67000000|0x0: MOD_SET_DX
[  149.967678] snd_sof:sof_ipc4_log_header: sof-audio-pci-intel-mtl 0000:00:1f.3: ipc tx done : 0x47000000|0x0: MOD_SET_DX [data size: 8]
[  149.967703] snd_sof:snd_sof_dsp_core_get: sof-audio-pci-intel-mtl 0000:00:1f.3: Core 2 powered up
[  149.967709] snd_sof:sof_ipc4_widget_setup: sof-audio-pci-intel-mtl 0000:00:1f.3: pipeline: 5 memory pages: 5
[  149.967715] snd_sof:sof_ipc4_widget_setup: sof-audio-pci-intel-mtl 0000:00:1f.3: Create widget pipeline.5 instance 0 - pipe 5 - core 2
[  149.967721] snd_sof:sof_ipc4_log_header: sof-audio-pci-intel-mtl 0000:00:1f.3: ipc tx      : 0x11000007|0x200000: GLB_CREATE_PIPELINE
[  150.473650] sof-audio-pci-intel-mtl 0000:00:1f.3: ipc timed out for 0x11000007|0x200000
[  150.473671] sof-audio-pci-intel-mtl 0000:00:1f.3: Attempting to prevent DSP from entering D3 state to preserve context
[  150.473677] sof-audio-pci-intel-mtl 0000:00:1f.3: ------------[ IPC dump start ]------------
[  150.473695] sof-audio-pci-intel-mtl 0000:00:1f.3: Host IPC initiator: 0x91000007|0x200000|0x0, target: 0x1b060000|0x0|0x0, ctl: 0x3
[  150.473702] sof-audio-pci-intel-mtl 0000:00:1f.3: ------------[ IPC dump end ]------------
[  150.473707] sof-audio-pci-intel-mtl 0000:00:1f.3: ------------[ DSP dump start ]------------
[  150.473711] sof-audio-pci-intel-mtl 0000:00:1f.3: IPC timeout
[  150.473716] sof-audio-pci-intel-mtl 0000:00:1f.3: fw_state: SOF_FW_BOOT_COMPLETE (7)
[  150.473729] sof-audio-pci-intel-mtl 0000:00:1f.3: ROM status: 0x0, ROM error: 0x0
[  150.473733] sof-audio-pci-intel-mtl 0000:00:1f.3: ROM debug status: 0x50000005, ROM debug error: 0x0
[  150.473748] sof-audio-pci-intel-mtl 0000:00:1f.3: ROM feature bit enabled
[  150.473752] sof-audio-pci-intel-mtl 0000:00:1f.3: ------------[ DSP dump end ]------------
[  150.473784] sof-audio-pci-intel-mtl 0000:00:1f.3: failed to create module pipeline.5
[  150.473800] snd_sof:sof_ipc4_log_header: sof-audio-pci-intel-mtl 0000:00:1f.3: ipc tx      : 0x47000000|0x0: MOD_SET_DX [data size: 8]
[  150.977783] sof-audio-pci-intel-mtl 0000:00:1f.3: ipc timed out for 0x47000000|0x0
[  150.977806] sof-audio-pci-intel-mtl 0000:00:1f.3: IPC timeout
[  150.977838] sof-audio-pci-intel-mtl 0000:00:1f.3: Failed to set up connected widgets
[  150.977854] sof-audio-pci-intel-mtl 0000:00:1f.3: error: failed widget list set up for pcm 2 dir 0
[  150.977862] sof-audio-pci-intel-mtl 0000:00:1f.3: ASoC: error at snd_soc_pcm_component_hw_params on 0000:00:1f.3: -110
[  150.977900]  Port2: ASoC: error at __soc_pcm_hw_params on Port2: -110
[  150.977915]  Port2: ASoC: error at dpcm_fe_dai_hw_params on Port2: -110
[  150.977927] snd_sof:sof_pcm_hw_free: sof-audio-pci-intel-mtl 0000:00:1f.3: pcm: free stream 2 dir 0
[  150.978739] snd_sof:sof_pcm_close: sof-audio-pci-intel-mtl 0000:00:1f.3: pcm: close stream 2 dir 0
[  150.979222] aplay (1756) used greatest stack depth: 11936 bytes left

@mengdonglin mengdonglin added the multicore-3cores Issues observed when 3 cores are used. label Jun 21, 2023
@serhiy-katsyuba-intel
Copy link
Contributor

The root cause of the issue is that IDC interrupt was not re-enabled after exiting D3.
Fix PR for mtl-005-drop-stable branch:
thesofproject/zephyr#68

@keqiaozhang
Copy link
Collaborator Author

This issue has been fixed on latest mtl-005 branch, closing it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working as expected I2S Applies to I2S bus for codec connection IPC timeout IPC timeout observed MTL Applies to Meteor Lake platform mtl-005 multicore Issues observed when not only core#0 is used. multicore-3cores Issues observed when 3 cores are used. P1 Blocker bugs or important features
Projects
None yet
Development

No branches or pull requests

5 participants