Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] TGLU_RVP_NOCODEC IPC timed out for PIPE_FREE when multiple-pipeline-all #5103

Closed
miRoox opened this issue Dec 17, 2021 · 7 comments
Closed
Labels
bug Something isn't working as expected dynamic pipeline Issues related to the dynamic pipeline feature Intel Linux Daily tests This issue can be found in internal Linux daily tests IPC timeout IPC timeout observed multicore Issues observed when not only core#0 is used. P2 Critical bugs or normal features TGL Applies to Tiger Lake

Comments

@miRoox
Copy link

miRoox commented Dec 17, 2021

Describe the bug
In inner test ID 8954, model=TGLU_RVP_NOCODEC, testcase=multiple-pipeline-all-50,
When running all pipelines in parallel on TGLU_RVP_NOCODEC, IPC timed out for PIPE_FREE.

To Reproduce

TPLG=/lib/firmware/intel/sof-tplg/sof-tgl-nocodec.tplg MODEL=TGLU_RVP_NOCODEC ~/sof-test/test-case/multiple-pipeline.sh -f p -c 20 -l 150

Reproduction Rate
Around 1/150 in manual reproduction.

Environment
Kernel Branch: topic/sof-dev
Kernel Commit: thesofproject/linux@9e5edeba
SOF Branch: main
SOF Commit: afac44af5f49
Topology:/lib/firmware/intel/sof-tplg/sof-tgl-nocodec.tplg
Platform: TGLU_RVP_NOCODEC

Screenshots or console output
dmesg

[ 6372.843906] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: widget BUF11.1 freed
[ 6372.843917] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: ipc tx: 0x30110000: GLB_TPLG_MSG: PIPE_FREE
[ 6373.338444] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: ipc rx: 0x90020000: GLB_TRACE_MSG: DMA_POSITION
[ 6373.338494] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: ipc rx done: 0x90020000: GLB_TRACE_MSG: DMA_POSITION
[ 6373.348350] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: ipc tx timed out for 0x30110000 (msg/reply size: 12/12)
[ 6373.348368] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: preventing DSP entering D3 state to preserve context
[ 6373.348373] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: ------------[ IPC dump start ]------------
[ 6373.348401] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: hda irq intsts 0x00000000 intlctl 0xc0000081 rirb 00
[ 6373.348408] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: dsp irq ppsts 0x00000000 adspis 0x00000000
[ 6373.348434] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: error: host status 0x00000000 dsp status 0x00000000 mask 0x00000003
[ 6373.348440] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: ------------[ IPC dump end ]------------
[ 6373.348445] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: ------------[ DSP dump start ]------------
[ 6373.348450] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: Firmware exception
[ 6373.348456] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: fw_state: SOF_FW_BOOT_COMPLETE (6)
[ 6373.348480] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: status: fw entered - code 00000005
[ 6373.348607] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: unexpected fault 0x00000000 trace 0x00004000
[ 6373.348614] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: ------------[ DSP dump end ]------------
[ 6373.348628] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: error: failed to free widget PIPELINE.11.SSP0.OUT
[ 6373.348646] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: ASoC: error at snd_soc_pcm_component_hw_free on 0000:00:1f.3: -110
[ 6373.348679] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: pcm: close stream 3 dir 0
[ 6373.348686] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: error: trace IO error
[ 6373.348786] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: pcm: close stream 0 dir 1
[ 6373.348932] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: pcm: free stream 1 dir 0
[ 6373.348943] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: pcm: trigger stream 0 dir 0 cmd 0
[ 6373.348968] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: ipc tx: 0x60050000: GLB_STREAM_MSG: TRIG_STOP
[ 6373.838361] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: ipc rx: 0x90020000: GLB_TRACE_MSG: DMA_POSITION
[ 6373.838389] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: ipc rx done: 0x90020000: GLB_TRACE_MSG: DMA_POSITION
[ 6373.853194] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: ipc tx timed out for 0x60050000 (msg/reply size: 12/12)
[ 6373.853202] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: preventing DSP entering D3 state to preserve context
[ 6373.853204] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: ------------[ IPC dump start ]------------
[ 6373.853228] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: hda irq intsts 0x00000000 intlctl 0xc0000081 rirb 00
[ 6373.853231] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: dsp irq ppsts 0x00000000 adspis 0x00000000
[ 6373.853253] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: error: host status 0x00000000 dsp status 0x00000000 mask 0x00000003
[ 6373.853256] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: ------------[ IPC dump end ]------------
[ 6373.853258] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: ------------[ DSP dump start ]------------
[ 6373.853260] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: Firmware exception
[ 6373.853266] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: fw_state: SOF_FW_BOOT_COMPLETE (6)
[ 6373.853287] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: status: fw entered - code 00000005
[ 6373.853382] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: unexpected fault 0x00000000 trace 0x00004000
[ 6373.853384] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: ------------[ DSP dump end ]------------
[ 6373.853433] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: ipc tx: 0x30020000: GLB_TPLG_MSG: COMP_FREE
[ 6373.853949] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: FW Poll Status: reg[0x160]=0x20140000 successful
[ 6373.853955] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: ASoC: error at soc_component_trigger on 0000:00:1f.3: -110
[ 6373.853958] kernel:  Port0: ASoC: trigger FE cmd: 0 failed: -110
@miRoox miRoox added bug Something isn't working as expected TGL Applies to Tiger Lake Intel Linux Daily tests This issue can be found in internal Linux daily tests labels Dec 17, 2021
@mengdonglin mengdonglin added IPC timeout IPC timeout observed dynamic pipeline Issues related to the dynamic pipeline feature multicore Issues observed when not only core#0 is used. labels Dec 21, 2021
@lgirdwood
Copy link
Member

@miRoox is there any stack trace with the FW exception (kernel may need to display stack trace).
@ujfalusi should stack trace be enabled by default in dmesg ?

@ujfalusi
Copy link
Contributor

@lgirdwood, in case of IPC timeout we don't have stack trace available. The stack can be only shown if we have firmware panic or assert.
The dtrace or etrace from debugfs might provide some hint on why we have timeout.

@lgirdwood
Copy link
Member

@ujfalusi should we see it here as the log says we have a FW exception ?

[ 6373.348445] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: ------------[ DSP dump start ]------------
[ 6373.348450] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: Firmware exception
[ 6373.348456] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: fw_state: SOF_FW_BOOT_COMPLETE (6)
[ 6373.348480] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: status: fw entered - code 00000005
[ 6373.348607] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: unexpected fault 0x00000000 trace 0x00004000
[ 6373.348614] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: ------------[ DSP dump end ]------------

@ujfalusi
Copy link
Contributor

@ujfalusi should we see it here as the log says we have a FW exception ?

[ 6373.348445] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: ------------[ DSP dump start ]------------
[ 6373.348450] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: Firmware exception
[ 6373.348456] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: fw_state: SOF_FW_BOOT_COMPLETE (6)
[ 6373.348480] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: status: fw entered - code 00000005
[ 6373.348607] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: unexpected fault 0x00000000 trace 0x00004000
[ 6373.348614] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: ------------[ DSP dump end ]------------

This is printed from snd_sof_handle_fw_exception() which is only called from ipc.c when we have TX timeout. I don't know why it is printing Firmware exception, but we have the timeout pointed out before.

I guess the idea is that this could be called from different places (it is not) and be a generic "exception" printer. The problem is that if we don't have a panic then we don't have callstack. The callstack is only prepared and given when the firmware panics/asserts.

I'll try to come up with a better expression for this and re-word it. The IPC timeout might or might not be fatal. In case it is fatal we are going to have a panic as well, which will print the stack.

"Firmware timeout" is a good candidate ;)

@kv2019i
Copy link
Collaborator

kv2019i commented Apr 1, 2022

This error is potentially caused by unsafe BUFFER_FREE. Fix for this in in review at #5635

@lgirdwood
Copy link
Member

This error is potentially caused by unsafe BUFFER_FREE. Fix for this in in review at #5635

Now merged, can you retest. Thanks !

@miRoox
Copy link
Author

miRoox commented Apr 18, 2022

This issue no longer recurs.

@miRoox miRoox closed this as completed Apr 18, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working as expected dynamic pipeline Issues related to the dynamic pipeline feature Intel Linux Daily tests This issue can be found in internal Linux daily tests IPC timeout IPC timeout observed multicore Issues observed when not only core#0 is used. P2 Critical bugs or normal features TGL Applies to Tiger Lake
Projects
None yet
Development

No branches or pull requests

6 participants