[BUG] FATAL EXCEPTION| CPU 0 EXCCAUSE 9 (load/store alignment) #7831

Closed
keqiaozhang opened this issue Jun 20, 2023 · 7 comments
Labels
bug Something isn't working as expected IPC4 Issues observed with IPC4 (same IPC as Windows) P2 Critical bugs or normal features TGL Applies to Tiger Lake

Comments

@keqiaozhang
Collaborator

keqiaozhang commented Jun 20, 2023

Describe the bug
Observed in the CI daily test while testing DMIC capture on the TGL-NOCODEC-IPC4 platform.
Hard to reproduce manually; no manual reproductions so far.

dmesg

[ 2875.268135] kernel: snd_sof:sof_ipc4_log_header: sof-audio-pci-intel-tgl 0000:00:1f.3: ipc tx done : 0x13000003|0x1: GLB_SET_PIPELINE_STATE [data size: 16]
[ 2875.268138] kernel: snd_sof:sof_ipc4_log_header: sof-audio-pci-intel-tgl 0000:00:1f.3: ipc tx      : 0x13000004|0x1: GLB_SET_PIPELINE_STATE [data size: 16]
[ 2875.274793] kernel: snd_sof:sof_ipc4_log_header: sof-audio-pci-intel-tgl 0000:00:1f.3: ipc rx      : 0x1b0a0000|0x0: GLB_NOTIFICATION|EXCEPTION_CAUGHT
[ 2875.274796] kernel: snd_sof:sof_ipc4_rx_msg: sof-audio-pci-intel-tgl 0000:00:1f.3: Unhandled DSP message: 0x1b0a0000|0x0
[ 2875.274798] kernel: snd_sof:sof_ipc4_log_header: sof-audio-pci-intel-tgl 0000:00:1f.3: ipc rx done : 0x1b0a0000|0x0: GLB_NOTIFICATION|EXCEPTION_CAUGHT
[ 2875.770703] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: ipc timed out for 0x13000004|0x1
[ 2875.770719] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: Attempting to prevent DSP from entering D3 state to preserve context
[ 2875.770726] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: ------------[ IPC dump start ]------------
[ 2875.770757] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: hda irq intsts 0x00000000 intlctl 0xc0000000 rirb 00
[ 2875.770765] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: dsp irq ppsts 0x00000000 adspis 0x00000000
[ 2875.770807] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: Host IPC initiator: 0x93000004|0x1|0x0, target: 0x0|0x0|0x80000000, ctl: 0x3
[ 2875.770815] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: ------------[ IPC dump end ]------------
[ 2875.770820] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: ------------[ DSP dump start ]------------
[ 2875.770825] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: IPC timeout
[ 2875.770831] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: fw_state: SOF_FW_BOOT_COMPLETE (7)
[ 2875.770851] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: 0x00000005: module: ROM, state: FW_ENTERED, running
[ 2875.770879] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: extended rom status:  0x5 0x0 0x0 0x0 0x0 0x0 0x0 0x1
[ 2875.770884] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: ------------[ DSP dump end ]------------
[ 2875.770937] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: failed to set final state 4 for all pipelines
[ 2875.770957] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: ASoC: error at soc_component_trigger on 0000:00:1f.3: -110
[ 2875.770974] kernel:  DMIC SFX2: ASoC: trigger FE cmd: 1 failed: -110
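For readers following the dmesg above, here is a minimal sketch that splits the logged IPC4 primary header words into fields. The bit positions (busy flag in bit 31, message type in bits 24-28, notification type in bits 16-23) and the two lookup tables are assumptions inferred from the values and names printed in this log, not copied from the driver headers; `decode_primary` is a hypothetical helper.

```python
# Sketch: decode the IPC4 primary header words that appear in the dmesg.
# Field positions are assumptions that happen to match the logged names.

GLB_TYPES = {0x13: "GLB_SET_PIPELINE_STATE", 0x1B: "GLB_NOTIFICATION"}
NOTIFICATIONS = {0x0A: "EXCEPTION_CAUGHT"}


def decode_primary(word):
    """Return the assumed fields of a 32-bit IPC4 primary header word."""
    fields = {
        "busy": (word >> 31) & 0x1,    # set in the "Host IPC initiator" dump
        "type": (word >> 24) & 0x1F,   # global message type
        "low16": word & 0xFFFF,        # requested state for SET_PIPELINE_STATE
    }
    fields["type_name"] = GLB_TYPES.get(fields["type"], "unknown")
    if fields["type_name"] == "GLB_NOTIFICATION":
        notif = (word >> 16) & 0xFF
        fields["notification"] = NOTIFICATIONS.get(notif, "unknown")
    return fields


for w in (0x13000004, 0x1B0A0000, 0x93000004):
    print(hex(w), decode_primary(w))
```

Under these assumptions, 0x93000004 in the IPC dump is the timed-out 0x13000004 request with the busy bit still set, and its low 16 bits (4) match the "failed to set final state 4" message.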

mtrace

[   57.026686] <inf> dai_intel_dmic: dai_dmic_start: dmic_start(), cic 0x0000c001
[   57.026696] <inf> dai_intel_dmic: dai_dmic_update_bits: dai_dmic_update_bits base 10000, reg 2000, mask 10000, value 0
[   57.026706] <inf> dai_intel_dmic: dai_dmic_start: dmic_start(), cic 0x0000c001
[   57.026720] <err> os: xtensa_excint1_c:  ** FATAL EXCEPTION
[   57.026730] <err> os: xtensa_excint1_c:  ** CPU 0 EXCCAUSE 9 (load/store alignment)
[   57.026740] <err> os: xtensa_excint1_c:  **  PC 0xbe01d73f VADDR 0x8000000a
[   57.026748] <err> os: xtensa_excint1_c:  **  PS 0x60520
[   57.026761] <err> os: xtensa_excint1_c:  **    (INTLEVEL:0 EXCM: 0 UM:1 RING:0 WOE:1 OWB:5 CALLINC:2)
[   57.026773] <err> os: z_xtensa_dump_stack:  **  A0 0xbe01a003  SP 0xbe0ac590  A2 0xbe089568  A3 0x80000006
[   57.026783] <err> os: z_xtensa_dump_stack:  **  A4 0xf  A5 0x9e09a554  A6 0xbe05bf70  A7 (nil)
[   57.026793] <err> os: z_xtensa_dump_stack:  **  A8 0x1  A9 0x1 A1Terminated
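As a reading aid for the exception dump above, the sketch below unpacks the PS value with the Xtensa PS field layout and checks the alignment of the faulting VADDR. The helper names are hypothetical, and the access width of the faulting instruction is not in the log, so the 4-byte case is only an assumption about what may have triggered EXCCAUSE 9.

```python
# Sketch: decode PS 0x60520 and check the alignment of VADDR 0x8000000a.

def decode_ps(ps):
    """Split an Xtensa PS register value into its named fields."""
    return {
        "INTLEVEL": ps & 0xF,
        "EXCM": (ps >> 4) & 0x1,
        "UM": (ps >> 5) & 0x1,
        "RING": (ps >> 6) & 0x3,
        "OWB": (ps >> 8) & 0xF,
        "CALLINC": (ps >> 16) & 0x3,
        "WOE": (ps >> 18) & 0x1,
    }


def is_aligned(addr, size):
    """True if addr is a multiple of the access size in bytes."""
    return addr % size == 0


print(decode_ps(0x60520))
vaddr = 0x8000000A
for size in (1, 2, 4):
    print(f"{size}-byte access aligned: {is_aligned(vaddr, size)}")
```

The decoded fields agree with the line the firmware prints after the PS value. The address is 2-byte aligned but not 4-byte aligned, so a 32-bit load or store at that address would raise an alignment exception.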

To Reproduce
~/sof-test/test-case/check-capture.sh -d 1 -l 1 -r 50

Reproduction Rate
Hard to reproduce this issue manually and no reproductions so far.

Environment

  1. Branch name and commit hash of the 2 repositories: sof (firmware/topology) and linux (kernel driver).
  2. Name of the topology file
    • Topology: {avs-tplg/sof-tgl-nocodec.tplg}
  3. Name of the platform(s) on which the bug is observed.
    • Platform: {TGLU_RVP_NOCODEC_IPC4ZPH}

Screenshots or console output

2023-06-19 22:19:40 UTC [REMOTE_COMMAND] arecord   -Dhw:0,28 -r 48000 -c 4 -f S32_LE -d 1 /dev/null -v -q
Hardware PCM card 0 'sof-nocodec' device 28 subdevice 0
Its setup is:
  stream       : CAPTURE
  access       : RW_INTERLEAVED
  format       : S32_LE
  subformat    : STD
  channels     : 4
  rate         : 48000
  exact rate   : 48000 (48000/1)
  msbits       : 32
  buffer_size  : 4096
  period_size  : 1024
  period_time  : 21333
  tstamp_mode  : NONE
  tstamp_type  : MONOTONIC
  period_step  : 1
  avail_min    : 1024
  period_event : 0
  start_threshold  : 1
  stop_threshold   : 4096
  silence_threshold: 0
  silence_size : 0
  boundary     : 4611686018427387904
  appl_ptr     : 0
  hw_ptr       : 0
arecord: pcm_read:2221: read error: Connection timed out
lsof: WARNING: can't stat() fuse.gvfsd-fuse file system /run/user/120/gvfs
      Output information may be incomplete.
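The stream parameters in the arecord output above are related by simple arithmetic. A small sketch, using only the values printed in the setup dump (the variable names are mine):

```python
# Sketch: cross-check the arecord setup values against each other.

rate_hz = 48000
channels = 4
bytes_per_sample = 4      # S32_LE is 32 bits per sample
period_size = 1024        # frames
buffer_size = 4096        # frames

period_time_us = period_size * 1_000_000 // rate_hz
frame_bytes = channels * bytes_per_sample
period_bytes = period_size * frame_bytes
periods_per_buffer = buffer_size // period_size

print("period_time_us:", period_time_us)
print("frame_bytes:", frame_bytes)
print("period_bytes:", period_bytes)
print("periods_per_buffer:", periods_per_buffer)
```

The computed period time matches the `period_time : 21333` line in the dump.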

dmesg.txt

mtrace.txt

@keqiaozhang keqiaozhang added bug Something isn't working as expected TGL Applies to Tiger Lake IPC4 Issues observed with IPC4 (same IPC as Windows) P2 Critical bugs or normal features labels Jun 20, 2023
@kv2019i
Collaborator

kv2019i commented Jun 20, 2023

I think we hit this before while debugging #7191 -> analysis here #7191 (comment)

@juimonen

@keqiaozhang we should probably try testing with zephyrproject-rtos/zephyr#59416

@kv2019i
Collaborator

kv2019i commented Jun 20, 2023

Yup, tested zephyrproject-rtos/zephyr#59416 and I think this fixes a clear bug in the DMIC IRQ handler.

@fredoh9
Contributor

fredoh9 commented Jun 26, 2023

Today's daily test also hit this issue on TGLU_RVP_NOCODEC_IPC4ZPH, although the mtrace doesn't have the exception printed.

Intel internal daily test: planresultdetail/28195?model=TGLU_RVP_NOCODEC_IPC4ZPH&testcase=check-capture-100times

Reproduction Rate:
Yes, it is hard to reproduce manually with the same build SHA1, but I see it at least once a week.

@fredoh9
Contributor

fredoh9 commented Jun 26, 2023

#7857, the west.yml PR that pulls in zephyrproject-rtos/zephyr#59416, merged after today's daily build. Will closely track the test results.

@fredoh9
Contributor

fredoh9 commented Jun 27, 2023

This issue was not found in today's daily test. Will monitor a couple more tests.

@fredoh9
Contributor

fredoh9 commented Jun 29, 2023

Looks good so far. I think we can close this now. We can re-open it if we see it again.

@fredoh9 fredoh9 closed this as completed Jun 29, 2023

4 participants