Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Revert "ipc4: relax the IPC timeout checks and be nicer to other thre… #8756

Merged
merged 1 commit into from
Jan 25, 2024

Conversation

RanderWang
Copy link
Collaborator

…ads"

When a stream is triggered to start, host kernel first sendis trigger start ipc message to fw and then start host dma for this stream. Ipc_wait_for_compound_msg is used to wait for all pipelines in the stream to be complete and need to be done fast since it blocks host to start hda dma. The reverted commit adds a 10 ms delay and results to host copier xrun warning for it can't get data from host dma. And this commit also makes a negative effect for stream_start_offset calculation, so revert it.

This reverts commit 909a327.

one fix for thesofproject/linux#4781

@RanderWang
Copy link
Collaborator Author

fw warning with reverted commit:

[ 4580.800270] <inf> copier: copier_comp_trigger: comp:0 0x4 copier_comp_trigger() cmd 7
[ 4580.800293] <inf> copier: copier_comp_trigger: comp:0 0x4 copier_comp_trigger() cmd 1
[ 4580.800328] <err> copier: copier_comp_trigger: comp:0 0x4 ------- dai pos 4288, dai link 527
[ 4580.800350] <inf> host_comp: host_get_copy_bytes_normal: comp:0 0x4 no bytes to copy, available samples: 0, free_samples: 192
[ 4580.801273] <inf> host_comp: host_get_copy_bytes_normal: comp:0 0x4 no bytes to copy, available samples: 0, free_samples: 192
[ 4580.802271] <inf> host_comp: host_get_copy_bytes_normal: comp:0 0x4 no bytes to copy, available samples: 0, free_samples: 192
[ 4580.803271] <inf> host_comp: host_get_copy_bytes_normal: comp:0 0x4 no bytes to copy, available samples: 0, free_samples: 192
[ 4580.804271] <inf> host_comp: host_get_copy_bytes_normal: comp:0 0x4 no bytes to copy, available samples: 0, free_samples: 192
[ 4580.805271] <inf> host_comp: host_get_copy_bytes_normal: comp:0 0x4 no bytes to copy, available samples: 0, free_samples: 192
[ 4580.806273] <inf> host_comp: host_get_copy_bytes_normal: comp:0 0x4 no bytes to copy, available samples: 0, free_samples: 192
[ 4580.807271] <inf> host_comp: host_get_copy_bytes_normal: comp:0 0x4 no bytes to copy, available samples: 0, free_samples: 192
[ 4580.808271] <inf> host_comp: host_get_copy_bytes_normal: comp:0 0x4 no bytes to copy, available samples: 0, free_samples: 192
[ 4580.809253] <inf> host_comp: host_get_copy_bytes_normal: comp:0 0x4 no bytes to copy, available samples: 0, free_samples: 192
[ 4580.810258] <inf> host_comp: host_get_copy_bytes_normal: comp:0 0x4 no bytes to copy, available samples: 0, free_samples: 192
[ 4580.811248] <inf> host_comp: host_get_copy_bytes_normal: comp:0 0x4 no bytes to copy, available samples: 0,

…ads"

When a stream is triggered to start, host kernel first sendis trigger
start ipc message to fw and then start host dma for this stream.
Ipc_wait_for_compound_msg is used to wait for all pipelines in the
stream to be complete and need to be done fast since it blocks host
to start hda dma. The reverted commit adds a 10 ms delay and results
to host copier xrun warning for it can't get data from host dma.

   And this commit also makes a negative effect for stream_start_offset
calculation which is for the difference between dai gateway and
host gateway. We calculate it in host start function but host dma is
started 10ms later, so there will be at least 10ms data error.

This reverts commit 909a327.

Signed-off-by: Rander Wang <rander.wang@intel.com>
Copy link
Collaborator

@kv2019i kv2019i left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @RanderWang , this can help potentially in other cases as well. I don't think this was the original intent of the change, but this will delay any IPC that requires to be handled in pipeline context, by 10ms.

@kv2019i
Copy link
Collaborator

kv2019i commented Jan 18, 2024

Note, the original patch has been referred to fix #7774 so we need to track test results carefully.

@RanderWang
Copy link
Collaborator Author

Note, the original patch has been referred to fix #7774 so we need to track test results carefully.

As I was working on the issue #7774 at that time, I tested the original patch but made no effect.

@RanderWang
Copy link
Collaborator Author

#7774 should be fixed by #8143


while (atomic_read(&msg_data.delayed_reply)) {
k_sleep(Z_TIMEOUT_MS(10));
k_sleep(Z_TIMEOUT_US(250));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

btian1

250 --> 10ms? it is 40 times enlarge? may miss something during sleep? how about 1ms?

above is the comments I added when the change first submitted, my question is: how 250us * 30 = 7.5ms comes from?

is it based on experience value to make total wait time < 10ms?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is set by experiments with test and log. Most waiting time is less than 1ms. I don't want to use 1ms since it will miss one schedule period in FW.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What context is this called from ? My expectation is the IPC thread and if so not sure how we can block LL from running ?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lgirdwood This is the IPC thread. It doesn't block the LL thread, but it will block processing of IPC messages until the LL timer fires next time, runs the IPC processing (that needs to happen in LL context), acks it as done, and then the IPC thread can continue. Now this adds a mandatory 10ms delay to each such handover from IPC thread to LL thread.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, so we have a timing dependency between LL thread and IPC thread and this is convergence. This need to be spelled out in the inline comments alongside setting a better tries count as 30 means about 7.5 LL ticks. i.e. we need to say > 1.5 LL ticks would be a significant failure

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kv2019i , @RanderWang is it better if 150 * 50? to make IPC response more quick?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lgirdwood This was the mechanism added by @lyakh to synchronize pipeline trigger processing with LL thread (so that pipeline state changes are not done async with the pipeline exection). @lgirdwood @btian1 We can improve the logic (or even devise a different way to wait for next ll tick, I think the original concern of trying to avoid unnecessary thread context switches is as valid as ever), but can we do that as a follow-up and merge this plain revert first? This helps with multiple pressing issues (AV sync and this seems to help with multicore scenarios), and we have done significant testing of this version with @RanderWang . If we want a modified new mechanism, that needs to be tested well, pushing the merge out.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine with a revert, but we need a folloup PR that spells out the relationships with some inline commentary.

@kv2019i
Copy link
Collaborator

kv2019i commented Jan 25, 2024

One fail in https://sof-ci.01.org/sofpr/PR8756/build1997/devicetest/index.html?model=MTLP_RVP_NOCODEC&testcase=multiple-pause-resume-50 -- no FW/drv errors in the logs. This looks like a case of #8770 .

I'll proceed wit merge

@kv2019i kv2019i merged commit 08cf3d2 into thesofproject:main Jan 25, 2024
38 of 44 checks passed
@marc-hb
Copy link
Collaborator

marc-hb commented Jan 27, 2024

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants