
Commit 1512231

kkdwivedi authored and Alexei Starovoitov committed
bpf: Enforce RCU protection for KF_RCU_PROTECTED
Currently, KF_RCU_PROTECTED only applies to iterator APIs and that too in a convoluted fashion: the presence of this flag on the kfunc is used to set MEM_RCU in iterator type, and the lack of RCU protection results in an error only later, once next() or destroy() methods are invoked on the iterator. While there is no bug, this is certainly a bit unintuitive, and makes the enforcement of the flag iterator specific.

In the interest of making this flag useful for other upcoming kfuncs, e.g. scx_bpf_cpu_curr() [0][1], add enforcement for invoking the kfunc in an RCU critical section in general. This would also mean that iterator APIs using KF_RCU_PROTECTED will error out earlier, instead of throwing an error for lack of RCU CS protection when next() or destroy() methods are invoked.

In addition to this, if the kfuncs tagged KF_RCU_PROTECTED return a pointer value, ensure that this pointer value is only usable in an RCU critical section. There might be edge cases where the return value is special and doesn't need to imply MEM_RCU semantics, but in general, the assumption should hold for the majority of kfuncs, and we can revisit things if necessary later.

[0]: https://lore.kernel.org/all/20250903212311.369697-3-christian.loehle@arm.com
[1]: https://lore.kernel.org/all/20250909195709.92669-1-arighi@nvidia.com

Tested-by: Andrea Righi <arighi@nvidia.com>
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20250917032755.4068726-2-memxor@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
1 parent 6ff4a0f commit 1512231

4 files changed: +31 -4 lines changed

Documentation/bpf/kfuncs.rst

Lines changed: 18 additions & 1 deletion
@@ -335,9 +335,26 @@ consider doing refcnt != 0 check, especially when returning a KF_ACQUIRE
 pointer. Note as well that a KF_ACQUIRE kfunc that is KF_RCU should very likely
 also be KF_RET_NULL.
 
+2.4.8 KF_RCU_PROTECTED flag
+---------------------------
+
+The KF_RCU_PROTECTED flag is used to indicate that the kfunc must be invoked in
+an RCU critical section. This is assumed by default in non-sleepable programs,
+and must be explicitly ensured by calling ``bpf_rcu_read_lock`` for sleepable
+ones.
+
+If the kfunc returns a pointer value, this flag also enforces that the returned
+pointer is RCU protected, and can only be used while the RCU critical section is
+active.
+
+The flag is distinct from the ``KF_RCU`` flag, which only ensures that its
+arguments are at least RCU protected pointers. This may transitively imply that
+RCU protection is ensured, but it does not work in cases of kfuncs which require
+RCU protection but do not take RCU protected arguments.
+
 .. _KF_deprecated_flag:
 
-2.4.8 KF_DEPRECATED flag
+2.4.9 KF_DEPRECATED flag
 ------------------------
 
 The KF_DEPRECATED flag is used for kfuncs which are scheduled to be
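
Note: the calling pattern described in the new 2.4.8 section can be illustrated with a small userspace sketch. This is only a model under loud assumptions: `bpf_rcu_read_lock` and `bpf_rcu_read_unlock` are redefined here as plain stand-ins that track a nesting depth, and `toy_kfunc_get_obj` and `struct toy_obj` are hypothetical names that do not exist in the kernel. In a real BPF program the verifier rejects a missing lock at load time; the stand-in kfunc merely returns NULL at run time to make the same point visible.

```c
#include <stddef.h>

/* Userspace stand-ins (assumption): the real functions are kernel kfuncs. */
static int rcu_depth;
static void bpf_rcu_read_lock(void)   { rcu_depth++; }
static void bpf_rcu_read_unlock(void) { rcu_depth--; }

/* Hypothetical object type returned by the hypothetical kfunc below. */
struct toy_obj {
	int value;
};

static struct toy_obj the_obj = { .value = 42 };

/*
 * Hypothetical kfunc that would be tagged KF_RCU_PROTECTED. The model
 * returns NULL outside an RCU critical section; the real verifier would
 * refuse to load the program instead.
 */
static struct toy_obj *toy_kfunc_get_obj(void)
{
	return rcu_depth > 0 ? &the_obj : NULL;
}

/* Models the body of a sleepable program that follows the documented rule. */
static int sleepable_prog_ok(void)
{
	struct toy_obj *obj;
	int value = -1;

	bpf_rcu_read_lock();
	obj = toy_kfunc_get_obj();
	if (obj)
		value = obj->value;	/* copy out while still protected */
	bpf_rcu_read_unlock();

	/* obj must not be used past this point; only the copy is returned. */
	return value;
}
```

The point of copying `value` out before unlocking is that, per the documentation text above, the returned pointer is only usable while the RCU critical section is active.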

kernel/bpf/verifier.c

Lines changed: 10 additions & 0 deletions
@@ -13931,6 +13931,11 @@ static int check_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
 		return -EACCES;
 	}
 
+	if (is_kfunc_rcu_protected(&meta) && !in_rcu_cs(env)) {
+		verbose(env, "kernel func %s requires RCU critical section protection\n", func_name);
+		return -EACCES;
+	}
+
 	/* In case of release function, we get register number of refcounted
 	 * PTR_TO_BTF_ID in bpf_kfunc_arg_meta, do the release now.
 	 */
@@ -14044,6 +14049,9 @@ static int check_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
 			/* Ensures we don't access the memory after a release_reference() */
 			if (meta.ref_obj_id)
 				regs[BPF_REG_0].ref_obj_id = meta.ref_obj_id;
+
+			if (is_kfunc_rcu_protected(&meta))
+				regs[BPF_REG_0].type |= MEM_RCU;
 		} else {
 			mark_reg_known_zero(env, regs, BPF_REG_0);
 			regs[BPF_REG_0].btf = desc_btf;
@@ -14052,6 +14060,8 @@ static int check_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
 
 			if (meta.func_id == special_kfunc_list[KF_bpf_get_kmem_cache])
 				regs[BPF_REG_0].type |= PTR_UNTRUSTED;
+			else if (is_kfunc_rcu_protected(&meta))
+				regs[BPF_REG_0].type |= MEM_RCU;
 
 			if (is_iter_next_kfunc(&meta)) {
 				struct bpf_reg_state *cur_iter;
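
Note: the two rules this change adds to check_kfunc_call() (reject the call outside an RCU critical section, and tag a returned pointer as RCU protected) can be modeled in a few lines of standalone C. This is a simplified sketch, not the verifier's code: every `toy_`/`TOY_` name is hypothetical, the flag values are arbitrary, and `toy_in_rcu_cs` assumes that a non-sleepable program always counts as being inside an RCU critical section, which is how the documentation in this commit describes the default.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bits for this sketch only; not the kernel's values. */
#define TOY_KF_RCU_PROTECTED (1u << 0)	/* stands in for KF_RCU_PROTECTED */
#define TOY_MEM_RCU          (1u << 1)	/* stands in for MEM_RCU */

/* Minimal stand-in for the verifier state that matters here. */
struct toy_env {
	bool sleepable;		/* program type may sleep */
	bool rcu_lock_held;	/* an explicit RCU read lock is active */
};

/* Minimal stand-in for the kfunc metadata. */
struct toy_kfunc {
	uint32_t flags;
	bool returns_ptr;
};

/* Non-sleepable programs are treated as always inside an RCU CS. */
static bool toy_in_rcu_cs(const struct toy_env *env)
{
	return !env->sleepable || env->rcu_lock_held;
}

/*
 * Returns 0 and fills *ret_type on success, or -EACCES when a protected
 * kfunc is called outside an RCU critical section.
 */
static int toy_check_kfunc_call(const struct toy_env *env,
				const struct toy_kfunc *kfunc,
				uint32_t *ret_type)
{
	bool rcu_protected = kfunc->flags & TOY_KF_RCU_PROTECTED;

	*ret_type = 0;

	/* Rule 1: the call itself must happen inside an RCU CS. */
	if (rcu_protected && !toy_in_rcu_cs(env))
		return -EACCES;

	/* Rule 2: a returned pointer inherits RCU protection. */
	if (kfunc->returns_ptr && rcu_protected)
		*ret_type |= TOY_MEM_RCU;

	return 0;
}
```

In this model the error surfaces at the call to the protected kfunc itself, which mirrors why the selftests in this commit now expect the message for the iterator's new() kfunc rather than for next().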

tools/testing/selftests/bpf/progs/cgroup_read_xattr.c

Lines changed: 1 addition & 1 deletion
@@ -73,7 +73,7 @@ int BPF_PROG(use_css_iter_non_sleepable)
 }
 
 SEC("lsm.s/socket_connect")
-__failure __msg("expected an RCU CS")
+__failure __msg("kernel func bpf_iter_css_new requires RCU critical section protection")
 int BPF_PROG(use_css_iter_sleepable_missing_rcu_lock)
 {
 	u64 cgrp_id = bpf_get_current_cgroup_id();

tools/testing/selftests/bpf/progs/iters_task_failure.c

Lines changed: 2 additions & 2 deletions
@@ -15,7 +15,7 @@ void bpf_rcu_read_lock(void) __ksym;
 void bpf_rcu_read_unlock(void) __ksym;
 
 SEC("?fentry.s/" SYS_PREFIX "sys_getpgid")
-__failure __msg("expected an RCU CS when using bpf_iter_task_next")
+__failure __msg("kernel func bpf_iter_task_new requires RCU critical section protection")
 int BPF_PROG(iter_tasks_without_lock)
 {
 	struct task_struct *pos;
@@ -27,7 +27,7 @@ int BPF_PROG(iter_tasks_without_lock)
 }
 
 SEC("?fentry.s/" SYS_PREFIX "sys_getpgid")
-__failure __msg("expected an RCU CS when using bpf_iter_css_next")
+__failure __msg("kernel func bpf_iter_css_new requires RCU critical section protection")
 int BPF_PROG(iter_css_without_lock)
 {
 	u64 cg_id = bpf_get_current_cgroup_id();
