[BUG] need to rate-limit 'nothing to copy' traces when mixer source is paused. #4672

Closed
plbossart opened this issue Aug 24, 2021 · 18 comments
Labels: area:SOF logging, bug (Something isn't working as expected), P2 (Critical bugs or normal features)

@plbossart (Member)

Describe the bug

When we use a mixer, pausing while using a single source causes a flood of trace messages:

[    16189998.887917] (        1237.760376) c0 dw-dma                 src/drivers/dw/dma.c:1069 INFO dw_dma_free_data_size() size is 0!
[    16190010.710833] (          11.822916) c0 dai          1.2            src/audio/dai.c:957  WARN dai_copy(): nothing to copy
[    16191249.200368] (        1238.489502) c0 dw-dma                 src/drivers/dw/dma.c:1069 INFO dw_dma_free_data_size() size is 0!
[    16191261.127450] (          11.927083) c0 dai          1.2            src/audio/dai.c:957  WARN dai_copy(): nothing to copy
[    16192498.991985] (        1237.864502) c0 dw-dma                 src/drivers/dw/dma.c:1069 INFO dw_dma_free_data_size() size is 0!
[    16192510.814901] (          11.822916) c0 dai          1.2            src/audio/dai.c:957  WARN dai_copy(): nothing to copy
[    16193749.200268] (        1238.385376) c0 dw-dma                 src/drivers/dw/dma.c:1069 INFO dw_dma_free_data_size() size is 0!

This is because we don't pause the mixer but keep it active.

To Reproduce

1. Use a topology with a mixer, e.g. cavs-nocodec.
2. aplay -Dhw:0,0 -c2 -r48000 -fS16_LE /dev/zero -vv -i
3. Press the space bar.
4. In another terminal, run sof-logger.

Reproduction Rate
100%

Expected behavior
rate-limit the trace messages in that case. There's nothing to copy because the sources are paused.

Impact
annoyance

@plbossart plbossart added the bug Something isn't working as expected label Aug 24, 2021
@bkokoszx bkokoszx self-assigned this Aug 31, 2021
@lgirdwood (Member)

@lyakh @keyonjie did one of you work on a PR that updated this message to dbg level, or a similar fix?

@lgirdwood lgirdwood added this to the v2.1 milestone Feb 9, 2022
@lgirdwood lgirdwood added the P3 Low-impact bugs or features label Feb 9, 2022
@kv2019i kv2019i removed this from the v2.1 milestone Feb 15, 2023
@marc-hb (Collaborator) commented Jun 9, 2023

Still happening with Zephyr
https://sof-ci.01.org/sofpr/PR7763/build9190/devicetest/index.html?model=TGLU_RVP_SDW_IPC4ZPH&testcase=multiple-pause-resume-
#7763

then this test failed with [ 74.224340] <err> dma_dw_common: xrun detected

[   74.211185] <err> dma_dw_common: xrun detected
[   74.211198] <wrn> dai_comp: comp:3 0x40003 dai_zephyr_copy(): nothing to copy
[   74.212185] <err> dma_dw_common: xrun detected
[   74.212220] <wrn> dai_comp: comp:3 0x40003 dai_zephyr_copy(): nothing to copy
[   74.213185] <err> dma_dw_common: xrun detected
[   74.213208] <wrn> dai_comp: comp:3 0x40003 dai_zephyr_copy(): nothing to copy
[   74.214185] <err> dma_dw_common: xrun detected
[   74.214198] <wrn> dai_comp: comp:3 0x40003 dai_zephyr_copy(): nothing to copy
[   74.215185] <err> dma_dw_common: xrun detected
[   74.215218] <wrn> dai_comp: comp:3 0x40003 dai_zephyr_copy(): nothing to copy
[   74.216186] <err> dma_dw_common: xrun detected
[   74.216208] <wrn> dai_comp: comp:3 0x40003 dai_zephyr_copy(): nothing to copy
[   74.217185] <err> dma_dw_common: xrun detected
[   74.217198] <wrn> dai_comp: comp:3 0x40003 dai_zephyr_copy(): nothing to copy
[   74.218185] <err> dma_dw_common: xrun detected
[   74.218220] <wrn> dai_comp: comp:3 0x40003 dai_zephyr_copy(): nothing to copy
[   74.219185] <err> dma_dw_common: xrun detected
[   74.219208] <wrn> dai_comp: comp:3 0x40003 dai_zephyr_copy(): nothing to copy
[   74.220185] <err> dma_dw_common: xrun detected
[   74.220198] <wrn> dai_comp: comp:3 0x40003 dai_zephyr_copy(): nothing to copy
[   74.221185] <err> dma_dw_common: xrun detected
[   74.221220] <wrn> dai_comp: comp:3 0x40003 dai_zephyr_copy(): nothing to copy
[   74.222185] <err> dma_dw_common: xrun detected
[   74.222208] <wrn> dai_comp: comp:3 0x40003 dai_zephyr_copy(): nothing to copy
[   74.223185] <err> dma_dw_common: xrun detected
[   74.223198] <wrn> dai_comp: comp:3 0x40003 dai_zephyr_copy(): nothing to copy
[   74.223251] <inf> ipc: rx	: 0x13000003|0x1
[   74.223286] <inf> pipe: pipe:2 0x0 pipe trigger cmd 2
[   74.223308] <inf> pipe: pipe:3 0x0 pipe trigger cmd 2
[   74.224253] <inf> ll_schedule: task complete 0xbe0b8bc0 0x20180U
[   74.224273] <inf> ll_schedule: num_tasks 2 total_num_tasks 2
[   74.224340] <err> dma_dw_common: xrun detected
[   74.224360] <inf> ll_schedule: task complete 0xbe0b94c0 0x20180U
[   74.224370] <inf> ll_schedule: num_tasks 1 total_num_tasks 1
[   74.224395] <inf> ll_schedule: zephyr_domain_unregister domain->type 1 domain->clk 4
[   74.230456] <inf> ipc: rx	: 0x13000002|0x1
[   74.230480] <inf> pipe: pipe:3 0x0 pipe trigger cmd 0
[   74.230646] <inf> pipe: pipe:2 0x0 pipe trigger cmd 0
[   74.230681] <wrn> copier: comp:2 0x40002 dai is not ready
[   74.231325] <inf> ipc: rx	: 0x46000002|0x3
[   74.231821] <inf> ipc: rx	: 0x12020000|0x0
[   74.231850] <inf> dma: dma_put(), dma = 0x9e093540, sref = 0
[   74.232636] <inf> ipc: rx	: 0x12030000|0x0
[   74.232870] <inf> dma: dma_put(), dma = 0x9e0934a0, sref = Terminated

@lgirdwood (Member)

@kv2019i any API in Zephyr to rate limit logs?

@lgirdwood lgirdwood added this to the TBD milestone Jul 4, 2023
@lyakh (Collaborator) commented Jul 5, 2023

> @kv2019i any API in Zephyr to rate limit logs?

@lgirdwood both SOF and Zephyr have means to drop logging entries in case of flooding; SOF can also merge "similar" messages in such situations, but I don't think either has a special call to mark individual logs for rate-limiting. I was looking for that recently too and haven't found anything.
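
For illustration only: in the absence of such an API, a per-statement rate limit can be hand-rolled around SOF's existing comp_warn() helper. The macro below is a hypothetical sketch, not existing SOF or Zephyr code.

```c
#include <sof/audio/component.h>	/* comp_warn(), struct comp_dev */

/* Hypothetical helper: emit the warning only once every 'interval'
 * invocations of this particular statement.  Each expansion gets its
 * own block-scope counter, so only the abusive call site is throttled
 * and other log statements are unaffected.
 */
#define COMP_WARN_RATELIMITED(dev, interval, msg)			\
	do {								\
		static unsigned int _rl_count;				\
		if ((_rl_count++ % (interval)) == 0)			\
			comp_warn(dev, msg);				\
	} while (0)

/* e.g. in dai_copy():
 *	COMP_WARN_RATELIMITED(dev, 1000, "dai_copy(): nothing to copy");
 */
```

One caveat with such a sketch: the counter is per call site, not per stream, so two pipelines sharing the same code path would share the budget.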

@lgirdwood (Member)

> @kv2019i any API in Zephyr to rate limit logs?
>
> @lgirdwood both SOF and Zephyr have means to drop logging entries in case of flooding; SOF can also merge "similar" messages in such situations, but I don't think either has a special call to mark individual logs for rate-limiting. I was looking for that recently too and haven't found anything.

Need to use Zephyr methods, SOF logging will go away.

@marc-hb (Collaborator) commented Jul 5, 2023

> @kv2019i any API in Zephyr to rate limit logs?

I had a look and I could not find anything either, just "drop newest" vs "drop oldest" when full.
https://docs.zephyrproject.org/latest/services/logging/index.html#log-message-allocation
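
Side note: the closest thing Zephyr does offer is per-module level filtering via LOG_MODULE_REGISTER(), which filters by severity, not by rate. A minimal sketch follows; the module name and the demotion to LOG_DBG() are illustrative, not a proposed patch.

```c
#include <stddef.h>
#include <zephyr/logging/log.h>

/* Compile-time ceiling for this module only; messages above the
 * configured level are compiled out, everything else is unthrottled. */
LOG_MODULE_REGISTER(copy_example, CONFIG_LOG_DEFAULT_LEVEL);

static void copy_cycle(size_t avail)
{
	if (avail == 0) {
		/* Demoted to debug: visible in debug builds, silent otherwise. */
		LOG_DBG("nothing to copy");
		return;
	}
	LOG_INF("copying %u bytes", (unsigned int)avail);
}
```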

> I don't think either has a special call to mark individual logs for rate-limiting

I don't see when someone would want to rate-limit one specific log statement while letting others flood the logs. EDIT: you could want a higher threshold for some statements versus others, see the example in #5597. Except this is not really an example, because it mentions only one statement, so there's a chance that changing global thresholds is still enough even in 5997.

The old sof-logger throttling is smart enough to automatically throttle only abusive log statements, leaving other logs unaffected. I'd expect something similar from Zephyr (if there were anything).

> Need to use Zephyr methods, SOF logging will go away.

I don't see how one could affect the other.

@marc-hb (Collaborator) commented Dec 12, 2023

Tentative fix submitted, buried in monster PR #8571

@andyross (Contributor)

My changes were predicated on the DP scheduler; I didn't realize there were other circumstances that tripped over this too. So they won't fix this case in particular, even though it's the same warning. Maybe we should just remove the warning entirely if it's known to be benign in other situations? Buffer over/underflow is already an error condition, so I don't see why this warning is needed.

(DP hits this due to asynchrony in the pipeline updates: the DP component sinks data synchronously in the source pipeline, but the output doesn't show up right away, so things downstream yell that they have nothing to do until the next time the pipeline is scheduled. But... that's precisely the design of the DP scheduler.)
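
A minimal sketch of the removal/demotion being discussed here, for illustration only (this is not the actual patch that was later merged):

```c
#include <sof/audio/component.h>	/* comp_dbg(), struct comp_dev */

/* Illustrative copy-path fragment: a starved cycle is expected when an
 * upstream component is paused or runs at a different cadence (DP), so
 * it is reported at debug level and treated as "try again next cycle".
 */
static int copy_one_period(struct comp_dev *dev, unsigned int avail_bytes)
{
	if (avail_bytes == 0) {
		comp_dbg(dev, "nothing to copy");	/* was comp_warn() */
		return 0;	/* not an error condition */
	}

	/* ... do the real copy here ... */
	return 0;
}
```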

andyross added a commit to andyross/sof that referenced this issue Dec 13, 2023
The DAI emits a flood of warnings when presented with empty buffers at
copy time.  That's not really a reasonable warning condition.  There
are multiple situations where components upstream may be paused or
asynchronous, leading to starvation in any given cycle.

Earlier code has hit this with paused components, where the log
messages are merely annoying.

One new situation is that when using the DP scheduler, updates are
async and may happen at a different cadence than the pipeline the DAI
is on; the upstream component will be presented with data in a (for
example) 1ms pipeline tick, but then send it to a different component
(echo cancellation, say) that batches it up into larger buffers (10ms)
and releases it downstream only at the slower cadence.

In that situation the flood of messages is being emitted during an
active stream, and tends to cause glitches all by itself after a few
seconds (and even where it doesn't, it floods the Zephyr log backend
to the extent that literally every message is dropped).

(I don't know that all such warnings are removed by this patch.  These
are only the ones I've seen in practice.)

Fixes thesofproject#4672

Signed-off-by: Andy Ross <andyross@google.com>
marc-hb pushed a commit to marc-hb/sof that referenced this issue Dec 13, 2023
Trying to reproduce DSP panic thesofproject#8621

 -------

[commit message body identical to the commit above]

Fixes thesofproject#4672

Signed-off-by: Andy Ross <andyross@google.com>
(cherry picked from commit 514576e)
@kv2019i kv2019i added P2 Critical bugs or normal features and removed P3 Low-impact bugs or features labels Dec 15, 2023
@kv2019i kv2019i modified the milestones: TBD, v2.9 Dec 15, 2023
@kv2019i (Collaborator) commented Dec 15, 2023

I think this is a bug. If the mixer itself is active, it should continue to produce data to the DAI if the DAI is running. I'll bump priority to P2 and assign to v2.9.

@marcinszkudlinski (Contributor) commented Dec 15, 2023

I'm still not comfortable with silencing those messages; they have been very useful for me recently.

What do you think about this solution (a sketch follows below)?

Cycles with no data may naturally happen at pipeline startup, so silence those.
BUT - once the first data portion arrives, any further "no data" cycle means a glitch.

  • introduce a status flag - "no data seen yet"; set it at startup and clear it at the first cycle with data
  • send not a warning but an error log in case of a "no data" cycle after startup (i.e. when "no data seen yet" is cleared and there's still no data)
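
A minimal sketch of this proposal, with hypothetical names (this is not existing SOF code):

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-stream state for the "no data seen yet" flag. */
struct starvation_state {
	bool data_seen;	/* set at the first cycle that carries data */
};

/* Call once per copy cycle; reset 'data_seen' to false on (re)start. */
static void check_starvation(struct starvation_state *st, size_t avail)
{
	if (avail > 0) {
		st->data_seen = true;	/* startup phase is over */
		return;
	}

	if (!st->data_seen)
		return;	/* empty cycles at startup are normal: stay silent */

	/* Empty cycle after data has already flowed: treat it as a glitch
	 * and report an error, e.g. comp_err(dev, "..."), rather than a
	 * per-cycle warning. */
}
```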

@lgirdwood (Member)

@kv2019i your fix is now merged, good to close?

@kv2019i (Collaborator) commented Dec 20, 2023

@lgirdwood I think what @plbossart described here originally is actually a bug, but when I tried to reproduce it today, I could not. Even without my recent fix PR (#8649), pausing PCMs with a mixer caused no flood of "nothing to copy" messages. So unless @plbossart disagrees, I think this can be closed. The original functional problem has been solved, and the "nothing to copy" messages are now limited to debug builds only.

@plbossart (Member, Author)

I am not sure what to make of all the comments.

This is a 2.5-year-old bug; I am not even sure if this was reported in the context of IPC4.

On one side we have our trusted @kv2019i, who cannot reproduce the error; on the other side there is no consensus, with a remark from @marcinszkudlinski and a reference to Google AEC which cannot possibly be related.

I would err on the side of keeping this open until there's consensus.

@kv2019i (Collaborator) commented Jan 12, 2024

@marcinszkudlinski Are you good to close this now? We have kept the warning logs in the code, but they are behind a build option (#8649). In any case, I think all known problems (that people can reproduce) have been fixed, so I don't want to keep this open for v2.9 if it's not clear that anything is broken or needs to be done.
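
For readers following along, gating a warning behind a build option usually looks like the sketch below; the CONFIG_ symbol here is a made-up placeholder, see #8649 for the actual change.

```c
#include <sof/audio/component.h>	/* comp_warn(), comp_dbg() */

/* CONFIG_PIPELINE_STARVATION_WARNINGS is a placeholder name; the real
 * Kconfig option introduced by the referenced PR may differ. */
#ifdef CONFIG_PIPELINE_STARVATION_WARNINGS
#define starvation_warn(dev, ...)	comp_warn(dev, __VA_ARGS__)
#else
#define starvation_warn(dev, ...)	comp_dbg(dev, __VA_ARGS__)
#endif

/* Usage in the copy path:
 *	starvation_warn(dev, "dai_zephyr_copy(): nothing to copy");
 */
```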

@marcinszkudlinski (Contributor)

for me - ok to close

@kv2019i kv2019i closed this as completed Jan 15, 2024
@marc-hb (Collaborator) commented Jan 18, 2024

The mtrace spam still makes logs unusable.

Try opening
https://sof-ci.01.org/sofpr/PR8754/build1979/devicetest/index.html?model=MTLP_RVP_NOCODEC&testcase=multiple-pause-resume-50, then click on the "mtrace" tab. The logs are so big that it takes at least 10-15 seconds just to open that tab, probably longer if you don't have a high-speed connection.

[ 2782.210465] <inf> pipe: pipeline_trigger: pipe:2 0x0 pipe trigger cmd 2
[ 2782.211160] <inf> ll_schedule: zephyr_ll_task_done: task complete 0xa0117cc0 0x20210U
[ 2782.211178] <inf> ll_schedule: zephyr_ll_task_done: num_tasks 1 total_num_tasks 3
[ 2782.211186] <inf> ll_schedule: zephyr_domain_unregister: zephyr_domain_unregister domain->type 1 domain->clk 3
[ 2782.221346] <inf> ipc: ipc_cmd: rx	: 0x13000004|0x1
[ 2782.221371] <inf> pipe: pipeline_trigger: pipe:2 0x0 pipe trigger cmd 8
[ 2782.221380] <inf> ll_schedule: zephyr_ll_task_schedule_common: task add 0xa0117cc0 0x20210U priority 0 flags 0x0
[ 2782.221398] <inf> ll_schedule: zephyr_domain_register: zephyr_domain_register domain->type 1 domain->clk 3 domain->ticks_per_ms 38400 period 1000
[ 2782.222170] <inf> host_comp: host_get_copy_bytes_normal: comp:2 0x30004 no bytes to copy, available samples: 0, free_samples: 768
[ 2782.223161] <inf> host_comp: host_get_copy_bytes_normal: comp:2 0x30004 no bytes to copy, available samples: 0, free_samples: 768
[ 2782.224160] <inf> host_comp: host_get_copy_bytes_normal: comp:2 0x30004 no bytes to copy, available samples: 0, free_samples: 768
[ 2782.225161] <inf> host_comp: host_get_copy_bytes_normal: comp:2 0x30004 no bytes to copy, available samples: 0, free_samples: 768
[ 2782.226160] <inf> host_comp: host_get_copy_bytes_normal: comp:2 0x30004 no bytes to copy, available samples: 0, free_samples: 768
[ 2782.227160] <inf> host_comp: host_get_copy_bytes_normal: comp:2 0x30004 no bytes to copy, available samples: 0, free_samples: 768
[ 2782.228160] <inf> host_comp: host_get_copy_bytes_normal: comp:2 0x30004 no bytes to copy, available samples: 0, free_samples: 768
[ 2782.229160] <inf> host_comp: host_get_copy_bytes_normal: comp:2 0x30004 no bytes to copy, available samples: 0, free_samples: 768
[ 2782.230160] <inf> host_comp: host_get_copy_bytes_normal: comp:2 0x30004 no bytes to copy, available samples: 0, free_samples: 768
[ 2782.231158] <inf> host_comp: host_get_copy_bytes_normal: comp:2 0x30004 no bytes to copy, available samples: 0, free_samples: 768
[ 2782.231463] <inf> pipe: pipeline_trigger: pipe:1 0x0 pipe trigger cmd 8
[ 2782.231488] <inf> ll_schedule: zephyr_ll_task_schedule_common: task add 0xa0118dc0 0x20210U priority 0 flags 0x0
[ 2782.232158] <inf> host_comp: host_get_copy_bytes_normal: comp:2 0x30004 no bytes to copy, available samples: 0, free_samples: 768
[ 2782.233161] <inf> host_comp: host_get_copy_bytes_normal: comp:2 0x30004 no bytes to copy, available samples: 0, free_samples: 768
[ 2782.234161] <inf> host_comp: host_get_copy_bytes_normal: comp:2 0x30004 no bytes to copy, available samples: 0, free_samples: 768
[ 2782.235161] <inf> host_comp: host_get_copy_bytes_normal: comp:2 0x30004 no bytes to copy, available samples: 0, free_samples: 768
[ 2782.236160] <inf> host_comp: host_get_copy_bytes_normal: comp:2 0x30004 no bytes to copy, available samples: 0, free_samples: 768
[ 2782.237160] <inf> host_comp: host_get_copy_bytes_normal: comp:2 0x30004 no bytes to copy, available samples: 0, free_samples: 768
[ 2782.238160] <inf> host_comp: host_get_copy_bytes_normal: comp:2 0x30004 no bytes to copy, available samples: 0, free_samples: 768
[ 2782.239161] <inf> host_comp: host_get_copy_bytes_normal: comp:2 0x30004 no bytes to copy, available samples: 0, free_samples: 768
[ 2782.240160] <inf> host_comp: host_get_copy_bytes_normal: comp:2 0x30004 no bytes to copy, available samples: 0, free_samples: 768
[ 2782.241158] <inf> host_comp: host_get_copy_bytes_normal: comp:2 0x30004 no bytes to copy, available samples: 0, free_samples: 768
[ 2782.241546] <inf> pipe: pipeline_trigger: pipe:0 0x0 pipe trigger cmd 8
[ 2782.241571] <inf> ll_schedule: zephyr_ll_task_schedule_common: task add 0xa0119240 0x20210U priority 0 flags 0x0
[ 2782.242231] <inf> host_comp: host_get_copy_bytes_normal: comp:2 0x30004 no bytes to copy, available samples: 0, free_samples: 768
[ 2782.242280] <inf> dma_intel_adsp_gpdma: intel_adsp_gpdma_power_off: intel_adsp_gpdma_power_off: dma dma@7c000 power off
[ 2782.242320] <inf> dma_intel_adsp_gpdma: intel_adsp_gpdma_power_on: intel_adsp_gpdma_power_on: dma dma@7c000 initialized
[ 2782.242363] <inf> dai_intel_dmic: dai_dmic_update_bits: dai_dmic_update_bits base 10000, reg 0, mask 6000000, value 4000000
[ 2782.242370] <inf> dai_intel_dmic: dai_dmic_update_bits: dai_dmic_update_bits base 10000, reg 1000, mask 10000, value 0

@marc-hb marc-hb reopened this Jan 18, 2024
@kv2019i (Collaborator) commented Jan 18, 2024

@marc-hb Please open a different bug for this. This log comes from a different component and in a different case, so we would be reusing the same bug ID for multiple issues here. The logs in the latest comment are expected, but we can debate the volume.

@marc-hb (Collaborator) commented Jan 18, 2024

Opened new issue #8761.

@marc-hb marc-hb closed this as completed Jan 18, 2024