Audio DSP power gating and clock gating enable #247
Conversation
Disable audio DSP clock and power gating before firmware boot and re-enable them after boot.
Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
Set ops for audio DSP clock and power gating for SKL+ platforms.
Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
/* PCI registers */
#define PCI_TCSEL 0x44
#define PCI_CGCTL 0x48
better to correct naming of these 2 registers:
#define PCI_PGCTL 0x44
#define PCI_CGCTL 0x48
keyonjie
left a comment
Can you clean up the naming for PGCTL and CGCTL? Also pay attention to the difference between gating-enable and gating-disable bits.
/* PCI_CGCTL bits */
#define PCI_CGCTL_MISCBDCGE_MASK BIT(6)
#define PCI_CGCTL_LSRMD_MASK BIT(4)
This bit belongs to PGCTL; rename it and move it to the /* PCI_PGCTL bits */ group.
sure, that will avoid confusion!
/* PCI_TVSEL bits */
#define PCI_TCSEL_ADSPPGD BIT(2)
rename this to PCI_PGCTL_ADSPPGD
u32 val;

/* Update PDCGE bit of CGCTL register */
val = enable ? PCI_CGCTL_ADSPDCGE : 0;
I think the polarity is the opposite for PCI_TCSEL_ADSPPGD?
Be careful: some bits are gating-disable bits (e.g. ADSPPGD) and others are gating-enable bits (e.g. PCI_CGCTL_ADSPDCGE), so we need to handle them carefully.
@keyonjie yes, but the code does the right thing here. When we want to enable clock and power gating we write 1 to PCI_CGCTL_ADSPDCGE in CGCTL and write 0 to PCI_PGCTL BIT(2).
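The polarity difference discussed above can be sketched as follows. This is only an illustration, not the driver code: the register offsets and the ADSPPGD bit position come from this thread, the `PCI_PGCTL_*` names are the ones proposed in review, the `PCI_CGCTL_ADSPDCGE` bit position is an assumption (the thread does not state it), and `fake_cfg` / `cfg_update_bits` / `dsp_clock_power_gating` are hypothetical stand-ins for PCI config space access.

```c
#include <stdbool.h>
#include <stdint.h>

#define BIT(n) (1u << (n))

/* Offsets and the ADSPPGD position are taken from the review thread. */
#define PCI_PGCTL          0x44
#define PCI_CGCTL          0x48
#define PCI_PGCTL_ADSPPGD  BIT(2) /* power gating DISABLE bit */
/* Assumption: the thread does not give this bit position. */
#define PCI_CGCTL_ADSPDCGE BIT(1) /* dynamic clock gating ENABLE bit */

/* Hypothetical stand-in for PCI config space, one dword per slot. */
static uint32_t fake_cfg[64];

/* Read-modify-write of the bits selected by mask. */
static void cfg_update_bits(unsigned int offset, uint32_t mask, uint32_t val)
{
	uint32_t *reg = &fake_cfg[offset / 4];

	*reg = (*reg & ~mask) | (val & mask);
}

/* enable == true: allow the HW to clock gate and power gate the DSP. */
static void dsp_clock_power_gating(bool enable)
{
	/* Enable-type bit: write 1 to allow clock gating. */
	cfg_update_bits(PCI_CGCTL, PCI_CGCTL_ADSPDCGE,
			enable ? PCI_CGCTL_ADSPDCGE : 0);

	/* Disable-type bit: write 0 to allow power gating. */
	cfg_update_bits(PCI_PGCTL, PCI_PGCTL_ADSPPGD,
			enable ? 0 : PCI_PGCTL_ADSPPGD);
}
```

Keeping both writes in one helper means callers only reason about "gating allowed or not", and the inverted polarity stays in a single place.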
lgirdwood
left a comment
Let's make this a little more generic so we can handle more than just clocks.
int (*resume)(struct snd_sof_dev *sof_dev);
int (*runtime_suspend)(struct snd_sof_dev *sof_dev, int state);
int (*runtime_resume)(struct snd_sof_dev *sof_dev);
void (*clock_power_gating)(struct snd_sof_dev *sof_dev, bool enable);
We should name this a little more generically so we can handle more than just clocks. It looks like we are entering an idle state here? If so, snd_sof_dsp_idle(sdev, bool idle). Please also don't name the variable enable here; it's best to use the verb name.
@lgirdwood we are enabling clock gating and power gating for the audio DSP here, i.e. the HW can gate when it is idle. So calling it idle would be misleading, since in the context of power management idle typically indicates that we are decrementing the ref count.
@ranj063 is the DSP HW in D3 state here when we gate the clocks? If so, shouldn't this be part of the suspend/resume calls at the HW abstraction level?
@ranj063 I am also not clear on the DSP state. If you do power gating why do you need clock gating? Isn't it more when you are in D0 (or D0ix) that you allow the clocks to be gated to allow parts of the hardware to save power?
@lgirdwood the HW is not in D3 here. And we are not performing power gating or clock gating here either. We are just enabling them, meaning that we are indicating to the HW that if the audio DSP is idle, it should be clock gated and power gated. When the audio device is idle and in D3, these bits will enable the HW to power gate/clock gate.
Before we boot the firmware, we disable power gating during DSP probe. So if we don't re-enable it, I think we will never be able to power gate the audio DSP.
@plbossart we are just enabling the HW to be able to do clock gating and power gating here. The dsp is not in D3 after these bits are set. But when an opportunity arises the HW will do the needful if these bits are set.
@lgirdwood @plbossart btw I didn't invent anything new here. I just copied it from the skylake driver. We seem to have missed this sequence earlier.
@ranj063 ok, so setting these bits allows the HW to opportunistically gate clocks/PM resources when the FW is idle after boot is complete? If so, it should probably be part of the existing HW abstraction ops for fw ready and boot?
@lgirdwood that's correct. I think it makes sense to make it part of the "run" ops. Let me make the change.
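A rough sketch of the ordering agreed on here, under stated assumptions: `struct dsp_dev`, `dsp_allow_gating`, `dsp_probe` and `dsp_run` are hypothetical names used only for illustration and are not the SOF ops; the register writes are reduced to a flag.

```c
#include <stdbool.h>

/* Hypothetical device state; not the real struct snd_sof_dev. */
struct dsp_dev {
	bool gating_allowed; /* HW may gate clocks/power when idle */
	bool fw_running;
};

/* Hypothetical helper standing in for the PGCTL/CGCTL register writes. */
static void dsp_allow_gating(struct dsp_dev *dev, bool allow)
{
	dev->gating_allowed = allow;
}

/* Probe: forbid gating so the firmware can be loaded and booted. */
static int dsp_probe(struct dsp_dev *dev)
{
	dev->fw_running = false;
	dsp_allow_gating(dev, false);
	return 0;
}

/* Run: start the firmware, then let the HW gate opportunistically. */
static int dsp_run(struct dsp_dev *dev)
{
	dev->fw_running = true;
	dsp_allow_gating(dev, true);
	return 0;
}
```

Folding the re-enable into the run step avoids adding a separate op, which is the direction the thread settles on.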
closing this PR now. Will open a new one with more comprehensive PM updates.
link to #249
Use Option::Map
'./test_progs -t test_local_storage' reported a splat:

[ 27.137569] =============================
[ 27.138122] [ BUG: Invalid wait context ]
[ 27.138650] 6.5.0-03980-gd11ae1b16b0a thesofproject#247 Tainted: G O
[ 27.139542] -----------------------------
[ 27.140106] test_progs/1729 is trying to lock:
[ 27.140713] ffff8883ef047b88 (stock_lock){-.-.}-{3:3}, at: local_lock_acquire+0x9/0x130
[ 27.141834] other info that might help us debug this:
[ 27.142437] context-{5:5}
[ 27.142856] 2 locks held by test_progs/1729:
[ 27.143352] #0: ffffffff84bcd9c0 (rcu_read_lock){....}-{1:3}, at: rcu_lock_acquire+0x4/0x40
[ 27.144492] #1: ffff888107deb2c0 (&storage->lock){..-.}-{2:2}, at: bpf_local_storage_update+0x39e/0x8e0
[ 27.145855] stack backtrace:
[ 27.146274] CPU: 0 PID: 1729 Comm: test_progs Tainted: G O 6.5.0-03980-gd11ae1b16b0a thesofproject#247
[ 27.147550] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
[ 27.149127] Call Trace:
[ 27.149490] <TASK>
[ 27.149867] dump_stack_lvl+0x130/0x1d0
[ 27.152609] dump_stack+0x14/0x20
[ 27.153131] __lock_acquire+0x1657/0x2220
[ 27.153677] lock_acquire+0x1b8/0x510
[ 27.157908] local_lock_acquire+0x29/0x130
[ 27.159048] obj_cgroup_charge+0xf4/0x3c0
[ 27.160794] slab_pre_alloc_hook+0x28e/0x2b0
[ 27.161931] __kmem_cache_alloc_node+0x51/0x210
[ 27.163557] __kmalloc+0xaa/0x210
[ 27.164593] bpf_map_kzalloc+0xbc/0x170
[ 27.165147] bpf_selem_alloc+0x130/0x510
[ 27.166295] bpf_local_storage_update+0x5aa/0x8e0
[ 27.167042] bpf_fd_sk_storage_update_elem+0xdb/0x1a0
[ 27.169199] bpf_map_update_value+0x415/0x4f0
[ 27.169871] map_update_elem+0x413/0x550
[ 27.170330] __sys_bpf+0x5e9/0x640
[ 27.174065] __x64_sys_bpf+0x80/0x90
[ 27.174568] do_syscall_64+0x48/0xa0
[ 27.175201] entry_SYSCALL_64_after_hwframe+0x6e/0xd8
[ 27.175932] RIP: 0033:0x7effb40e41ad
[ 27.176357] Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d8
[ 27.179028] RSP: 002b:00007ffe64c21fc8 EFLAGS: 00000202 ORIG_RAX: 0000000000000141
[ 27.180088] RAX: ffffffffffffffda RBX: 00007ffe64c22768 RCX: 00007effb40e41ad
[ 27.181082] RDX: 0000000000000020 RSI: 00007ffe64c22008 RDI: 0000000000000002
[ 27.182030] RBP: 00007ffe64c21ff0 R08: 0000000000000000 R09: 00007ffe64c22788
[ 27.183038] R10: 0000000000000064 R11: 0000000000000202 R12: 0000000000000000
[ 27.184006] R13: 00007ffe64c22788 R14: 00007effb42a1000 R15: 0000000000000000
[ 27.184958] </TASK>

It complains about acquiring a local_lock while holding a raw_spin_lock. This means memory should not be allocated while holding a raw_spin_lock, since that is not safe for RT.

raw_spin_lock is needed because bpf_local_storage supports tracing context. In particular, for task local storage it is easy to get a "current" task PTR_TO_BTF_ID in a tracing bpf prog. However, task (and cgroup) local storage has already been moved to the bpf mem allocator, which can be used after raw_spin_lock.

The splat is for the sk storage. sk (and inode) storage has not been moved to the bpf mem allocator. With or without raw_spin_lock, kzalloc(GFP_ATOMIC) could theoretically be unsafe in tracing context. However, the local storage helper requires a verifier-accepted sk pointer (PTR_TO_BTF_ID), so it is hypothetical whether that (meaning running a bpf prog in a kzalloc-unsafe context while also holding a verifier-accepted sk pointer) could happen.

This patch avoids kzalloc after raw_spin_lock to silence the splat. There is an existing kzalloc before the raw_spin_lock. At that point, a kzalloc is very likely required because a lookup has just been done. Thus, this patch always does the kzalloc before acquiring the raw_spin_lock and removes the later kzalloc usage after the raw_spin_lock. After this change, there will be a charge and then an uncharge during the syscall bpf_map_update_elem() code path. This patch opts for simplicity and does not continue the old optimization that saved one charge and uncharge.

This issue dates back to the very first commit of bpf_sk_storage, which has been refactored multiple times to create task, inode, and cgroup storage. This patch uses a Fixes tag with a more recent commit that should be easier to backport.

Fixes: b00fa38 ("bpf: Enable non-atomic allocations in local storage")
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20230901231129.578493-2-martin.lau@linux.dev
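The "allocate before taking the lock" idea in the commit message above can be illustrated with a small userspace analogy. This is not the kernel patch: `struct storage`, `storage_lock`, `storage_unlock` and `storage_update` are hypothetical names, `malloc` stands in for kzalloc, and a C11 `atomic_flag` spin loop stands in for raw_spin_lock.

```c
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical single-slot storage guarded by a spinlock. */
struct elem {
	int value;
};

struct storage {
	atomic_flag lock;
	struct elem *slot;
};

static void storage_lock(struct storage *s)
{
	while (atomic_flag_test_and_set_explicit(&s->lock, memory_order_acquire))
		; /* spin until the flag is released */
}

static void storage_unlock(struct storage *s)
{
	atomic_flag_clear_explicit(&s->lock, memory_order_release);
}

/* Returns 0 on success, -1 if the allocation fails. */
static int storage_update(struct storage *s, int value)
{
	struct elem *old;
	/* Allocate unconditionally, before the lock is taken. */
	struct elem *fresh = malloc(sizeof(*fresh));

	if (!fresh)
		return -1;
	fresh->value = value;

	/* Critical section only swaps pointers; it never allocates. */
	storage_lock(s);
	old = s->slot;
	s->slot = fresh;
	storage_unlock(s);

	free(old); /* free(NULL) is a no-op */
	return 0;
}
```

The cost is an allocation on every update, even when one might have been avoided, which mirrors the trade-off the commit message describes as choosing simplicity over the old optimization.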
Enable clock gating and power gating after fw boot