Commit 1ef0736
Daniel Borkmann says:

====================
pull-request: bpf-next 2022-05-23

We've added 113 non-merge commits during the last 26 day(s) which contain
a total of 121 files changed, 7425 insertions(+), 1586 deletions(-).

The main changes are:

1) Speed up symbol resolution for kprobes multi-link attachments, from Jiri Olsa.

2) Add BPF dynamic pointer infrastructure e.g. to allow for dynamically sized ringbuf
   reservations without extra memory copies, from Joanne Koong. (See the usage
   sketch after this list.)

3) Big batch of libbpf improvements towards libbpf 1.0 release, from Andrii Nakryiko.

4) Add BPF link iterator to traverse links via seq_file ops, from Dmitrii Dolgov.

5) Add source IP address to BPF tunnel key infrastructure, from Kaixi Fan.

6) Refine unprivileged BPF to disable only object-creating commands, from Alan Maguire.

7) Fix JIT blinding of ld_imm64 when they point to subprogs, from Alexei Starovoitov.

8) Add BPF access to mptcp_sock structures and their meta data, from Geliang Tang.

9) Add new BPF helper for access to remote CPU's BPF map elements, from Feng Zhou.

10) Allow attaching 64-bit cookie to BPF link of fentry/fexit/fmod_ret, from Kui-Feng Lee.
    (See the cookie sketch after this list.)

11) Follow-ups to typed pointer support in BPF maps, from Kumar Kartikeya Dwivedi.

12) Add busy-poll test cases to the XSK selftest suite, from Magnus Karlsson.

13) Improvements in BPF selftest test_progs subtest output, from Mykola Lysenko.

14) Fill bpf_prog_pack allocator areas with illegal instructions, from Song Liu.

15) Add generic batch operations for BPF map-in-map cases, from Takshak Chahande.

16) Make bpf_jit_enable more user friendly when permanently on 1, from Tiezhu Yang.

17) Fix an array overflow in bpf_trampoline_get_progs(), from Yuntao Wang.

====================
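
To make item 2 concrete, here is a minimal sketch of a BPF program using the
new dynptr ringbuf helpers. The map name, hook point, and record layout are
illustrative assumptions by the editor, not code from this pull request:

// SPDX-License-Identifier: GPL-2.0
/* Sketch: reserve a runtime-sized ringbuf sample via a dynptr,
 * write into it in place, and submit it -- no bounce buffer needed.
 */
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

struct {
	__uint(type, BPF_MAP_TYPE_RINGBUF);
	__uint(max_entries, 256 * 1024);
} rb SEC(".maps");

SEC("tracepoint/syscalls/sys_enter_write")
int log_write(void *ctx)
{
	struct bpf_dynptr ptr;
	__u32 size = 64;	/* could be computed at runtime */
	__u64 *ts;

	if (bpf_ringbuf_reserve_dynptr(&rb, size, 0, &ptr)) {
		/* The dynptr must still be released on failure. */
		bpf_ringbuf_discard_dynptr(&ptr, 0);
		return 0;
	}

	ts = bpf_dynptr_data(&ptr, 0, sizeof(*ts));
	if (ts)
		*ts = bpf_ktime_get_ns();

	bpf_ringbuf_submit_dynptr(&ptr, 0);
	return 0;
}

char LICENSE[] SEC("license") = "GPL";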
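
And for item 10, a sketch of reading a per-link cookie back from an fentry
program. The target function, the cookie value, and the libbpf attach API
named below (bpf_trace_opts / bpf_program__attach_trace_opts, from the same
series) are assumptions used to illustrate the flow:

// SPDX-License-Identifier: GPL-2.0
/* Sketch: an fentry program retrieving the 64-bit cookie that was
 * supplied when its link was created.
 */
#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>

SEC("fentry/do_unlinkat")
int BPF_PROG(on_unlinkat, int dfd, struct filename *name)
{
	__u64 cookie = bpf_get_attach_cookie(ctx);

	bpf_printk("do_unlinkat cookie=%llu", cookie);
	return 0;
}

char LICENSE[] SEC("license") = "GPL";

User-space side (libbpf), assuming a generated skeleton:

	LIBBPF_OPTS(bpf_trace_opts, opts, .cookie = 0xdeadbeef);
	struct bpf_link *link =
		bpf_program__attach_trace_opts(skel->progs.on_unlinkat, &opts);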

Link: https://lore.kernel.org/r/20220523223805.27931-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
kuba-moo committed May 23, 2022
2 parents 9fa87dd + 608b638 commit 1ef0736
Showing 121 changed files with 7,422 additions and 1,582 deletions.
4 changes: 2 additions & 2 deletions Documentation/bpf/instruction-set.rst
@@ -130,7 +130,7 @@ Byte swap instructions
 The byte swap instructions use an instruction class of ``BPF_ALU`` and a 4-bit
 code field of ``BPF_END``.
 
-The byte swap instructions instructions operate on the destination register
+The byte swap instructions operate on the destination register
 only and do not use a separate source register or immediate value.
 
 The 1-bit source operand field in the opcode is used to select what byte
@@ -157,7 +157,7 @@ Examples:
   dst_reg = htobe64(dst_reg)
 
 ``BPF_FROM_LE`` and ``BPF_FROM_BE`` exist as aliases for ``BPF_TO_LE`` and
-``BPF_TO_LE`` respetively.
+``BPF_TO_BE`` respectively.
 
 Jump instructions
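
For readers of the corrected text above, a byte swap instruction can be
written out as a raw struct bpf_insn. This encoding sketch follows the uapi
field definitions and is an editor's illustration, not part of the patch:

#include <linux/bpf.h>	/* struct bpf_insn, BPF_ALU, BPF_END, BPF_TO_BE */

/* dst_reg = htobe64(dst_reg): class BPF_ALU, code BPF_END, the 1-bit
 * source field set to BPF_TO_BE, and the immediate selecting the
 * 64-bit width (16, 32 and 64 are the valid widths).
 */
static const struct bpf_insn swap_be64 = {
	.code    = BPF_ALU | BPF_END | BPF_TO_BE,
	.dst_reg = BPF_REG_0,
	.src_reg = 0,
	.off     = 0,
	.imm     = 64,
};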
2 changes: 2 additions & 0 deletions MAINTAINERS
@@ -13799,6 +13799,7 @@ F: include/net/mptcp.h
 F: include/trace/events/mptcp.h
 F: include/uapi/linux/mptcp.h
 F: net/mptcp/
+F: tools/testing/selftests/bpf/*/*mptcp*.c
 F: tools/testing/selftests/net/mptcp/
 
 NETWORKING [TCP]
@@ -21551,6 +21552,7 @@ K: (?:\b|_)xdp(?:\b|_)
 XDP SOCKETS (AF_XDP)
 M: Björn Töpel <bjorn@kernel.org>
 M: Magnus Karlsson <magnus.karlsson@intel.com>
+M: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
 R: Jonathan Lemon <jonathan.lemon@gmail.com>
 L: netdev@vger.kernel.org
 L: bpf@vger.kernel.org
2 changes: 1 addition & 1 deletion arch/s390/net/bpf_jit_comp.c
@@ -1809,7 +1809,7 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *fp)
 	/*
 	 * Three initial passes:
 	 *   - 1/2: Determine clobbered registers
-	 *   - 3:   Calculate program size and addrs arrray
+	 *   - 3:   Calculate program size and addrs array
 	 */
 	for (pass = 1; pass <= 3; pass++) {
 		if (bpf_jit_prog(&jit, fp, extra_pass, stack_depth)) {
1 change: 1 addition & 0 deletions arch/x86/include/asm/text-patching.h
@@ -45,6 +45,7 @@ extern void *text_poke(void *addr, const void *opcode, size_t len);
 extern void text_poke_sync(void);
 extern void *text_poke_kgdb(void *addr, const void *opcode, size_t len);
 extern void *text_poke_copy(void *addr, const void *opcode, size_t len);
+extern void *text_poke_set(void *addr, int c, size_t len);
 extern int poke_int3_handler(struct pt_regs *regs);
 extern void text_poke_bp(void *addr, const void *opcode, size_t len, const void *emulate);
67 changes: 57 additions & 10 deletions arch/x86/kernel/alternative.c
@@ -994,7 +994,21 @@ static inline void unuse_temporary_mm(temp_mm_state_t prev_state)
 __ro_after_init struct mm_struct *poking_mm;
 __ro_after_init unsigned long poking_addr;
 
-static void *__text_poke(void *addr, const void *opcode, size_t len)
+static void text_poke_memcpy(void *dst, const void *src, size_t len)
+{
+	memcpy(dst, src, len);
+}
+
+static void text_poke_memset(void *dst, const void *src, size_t len)
+{
+	int c = *(const int *)src;
+
+	memset(dst, c, len);
+}
+
+typedef void text_poke_f(void *dst, const void *src, size_t len);
+
+static void *__text_poke(text_poke_f func, void *addr, const void *src, size_t len)
 {
 	bool cross_page_boundary = offset_in_page(addr) + len > PAGE_SIZE;
 	struct page *pages[2] = {NULL};
@@ -1059,7 +1073,7 @@ static void *__text_poke(void *addr, const void *opcode, size_t len)
 	prev = use_temporary_mm(poking_mm);
 
 	kasan_disable_current();
-	memcpy((u8 *)poking_addr + offset_in_page(addr), opcode, len);
+	func((u8 *)poking_addr + offset_in_page(addr), src, len);
 	kasan_enable_current();
 
 	/*
@@ -1087,11 +1101,13 @@ static void *__text_poke(void *addr, const void *opcode, size_t len)
 			   (cross_page_boundary ? 2 : 1) * PAGE_SIZE,
 			   PAGE_SHIFT, false);
 
-	/*
-	 * If the text does not match what we just wrote then something is
-	 * fundamentally screwy; there's nothing we can really do about that.
-	 */
-	BUG_ON(memcmp(addr, opcode, len));
+	if (func == text_poke_memcpy) {
+		/*
+		 * If the text does not match what we just wrote then something is
+		 * fundamentally screwy; there's nothing we can really do about that.
+		 */
+		BUG_ON(memcmp(addr, src, len));
+	}
 
 	local_irq_restore(flags);
 	pte_unmap_unlock(ptep, ptl);
@@ -1118,7 +1134,7 @@ void *text_poke(void *addr, const void *opcode, size_t len)
 {
 	lockdep_assert_held(&text_mutex);
 
-	return __text_poke(addr, opcode, len);
+	return __text_poke(text_poke_memcpy, addr, opcode, len);
 }
 
 /**
@@ -1137,7 +1153,7 @@ void *text_poke(void *addr, const void *opcode, size_t len)
  */
 void *text_poke_kgdb(void *addr, const void *opcode, size_t len)
 {
-	return __text_poke(addr, opcode, len);
+	return __text_poke(text_poke_memcpy, addr, opcode, len);
 }
 
 /**
@@ -1167,7 +1183,38 @@ void *text_poke_copy(void *addr, const void *opcode, size_t len)
 
 		s = min_t(size_t, PAGE_SIZE * 2 - offset_in_page(ptr), len - patched);
 
-		__text_poke((void *)ptr, opcode + patched, s);
+		__text_poke(text_poke_memcpy, (void *)ptr, opcode + patched, s);
 		patched += s;
 	}
 	mutex_unlock(&text_mutex);
 	return addr;
 }
+
+/**
+ * text_poke_set - memset into (an unused part of) RX memory
+ * @addr: address to modify
+ * @c: the byte to fill the area with
+ * @len: length to copy, could be more than 2x PAGE_SIZE
+ *
+ * This is useful to overwrite unused regions of RX memory with illegal
+ * instructions.
+ */
+void *text_poke_set(void *addr, int c, size_t len)
+{
+	unsigned long start = (unsigned long)addr;
+	size_t patched = 0;
+
+	if (WARN_ON_ONCE(core_kernel_text(start)))
+		return NULL;
+
+	mutex_lock(&text_mutex);
+	while (patched < len) {
+		unsigned long ptr = start + patched;
+		size_t s;
+
+		s = min_t(size_t, PAGE_SIZE * 2 - offset_in_page(ptr), len - patched);
+
+		__text_poke(text_poke_memset, (void *)ptr, (void *)&c, s);
+		patched += s;
+	}
+	mutex_unlock(&text_mutex);
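
The change above threads a text_poke_f callback through __text_poke() so one
map/write/flush path serves both copying (text_poke()) and filling
(text_poke_set()). A standalone sketch of the same dispatch pattern, with
hypothetical names and none of the kernel's mapping machinery:

#include <stdio.h>
#include <string.h>

/* One routine owns setup/teardown; a callback decides how bytes land. */
typedef void fill_f(void *dst, const void *src, size_t len);

static void fill_copy(void *dst, const void *src, size_t len)
{
	memcpy(dst, src, len);
}

static void fill_set(void *dst, const void *src, size_t len)
{
	memset(dst, *(const int *)src, len);
}

static void poke(fill_f *func, void *dst, const void *src, size_t len)
{
	/* In the kernel: map a temporary alias, disable KASAN, ... */
	func(dst, src, len);
	/* ... flush TLB, verify (memcpy case only), unmap. */
}

int main(void)
{
	char buf[16] = { 0 };
	int cc = 0xcc;

	poke(fill_copy, buf, "patched", 8);	/* text_poke() style */
	poke(fill_set, buf + 8, &cc, 8);	/* text_poke_set() style */
	printf("%s\n", buf);
	return 0;
}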
79 changes: 55 additions & 24 deletions arch/x86/net/bpf_jit_comp.c
@@ -228,6 +228,11 @@ static void jit_fill_hole(void *area, unsigned int size)
 	memset(area, 0xcc, size);
 }
 
+int bpf_arch_text_invalidate(void *dst, size_t len)
+{
+	return IS_ERR_OR_NULL(text_poke_set(dst, 0xcc, len));
+}
+
 struct jit_context {
 	int cleanup_addr; /* Epilogue code offset */
 
@@ -1762,13 +1767,32 @@ static void restore_regs(const struct btf_func_model *m, u8 **prog, int nr_args,
 }
 
 static int invoke_bpf_prog(const struct btf_func_model *m, u8 **pprog,
-			   struct bpf_prog *p, int stack_size, bool save_ret)
+			   struct bpf_tramp_link *l, int stack_size,
+			   int run_ctx_off, bool save_ret)
 {
 	u8 *prog = *pprog;
 	u8 *jmp_insn;
+	int ctx_cookie_off = offsetof(struct bpf_tramp_run_ctx, bpf_cookie);
+	struct bpf_prog *p = l->link.prog;
+	u64 cookie = l->cookie;
+
+	/* mov rdi, cookie */
+	emit_mov_imm64(&prog, BPF_REG_1, (long) cookie >> 32, (u32) (long) cookie);
+
+	/* Prepare struct bpf_tramp_run_ctx.
+	 *
+	 * bpf_tramp_run_ctx is already preserved by
+	 * arch_prepare_bpf_trampoline().
+	 *
+	 * mov QWORD PTR [rbp - run_ctx_off + ctx_cookie_off], rdi
+	 */
+	emit_stx(&prog, BPF_DW, BPF_REG_FP, BPF_REG_1, -run_ctx_off + ctx_cookie_off);
 
 	/* arg1: mov rdi, progs[i] */
 	emit_mov_imm64(&prog, BPF_REG_1, (long) p >> 32, (u32) (long) p);
+	/* arg2: lea rsi, [rbp - ctx_cookie_off] */
+	EMIT4(0x48, 0x8D, 0x75, -run_ctx_off);
 
 	if (emit_call(&prog,
 		      p->aux->sleepable ? __bpf_prog_enter_sleepable :
 		      __bpf_prog_enter, prog))
@@ -1814,6 +1838,8 @@ static int invoke_bpf_prog(const struct btf_func_model *m, u8 **pprog,
 	emit_mov_imm64(&prog, BPF_REG_1, (long) p >> 32, (u32) (long) p);
 	/* arg2: mov rsi, rbx <- start time in nsec */
 	emit_mov_reg(&prog, true, BPF_REG_2, BPF_REG_6);
+	/* arg3: lea rdx, [rbp - run_ctx_off] */
+	EMIT4(0x48, 0x8D, 0x55, -run_ctx_off);
 	if (emit_call(&prog,
 		      p->aux->sleepable ? __bpf_prog_exit_sleepable :
 		      __bpf_prog_exit, prog))
@@ -1850,24 +1876,24 @@ static int emit_cond_near_jump(u8 **pprog, void *func, void *ip, u8 jmp_cond)
 }
 
 static int invoke_bpf(const struct btf_func_model *m, u8 **pprog,
-		      struct bpf_tramp_progs *tp, int stack_size,
-		      bool save_ret)
+		      struct bpf_tramp_links *tl, int stack_size,
+		      int run_ctx_off, bool save_ret)
 {
 	int i;
 	u8 *prog = *pprog;
 
-	for (i = 0; i < tp->nr_progs; i++) {
-		if (invoke_bpf_prog(m, &prog, tp->progs[i], stack_size,
-				    save_ret))
+	for (i = 0; i < tl->nr_links; i++) {
+		if (invoke_bpf_prog(m, &prog, tl->links[i], stack_size,
+				    run_ctx_off, save_ret))
 			return -EINVAL;
 	}
 	*pprog = prog;
 	return 0;
 }
 
 static int invoke_bpf_mod_ret(const struct btf_func_model *m, u8 **pprog,
-			      struct bpf_tramp_progs *tp, int stack_size,
-			      u8 **branches)
+			      struct bpf_tramp_links *tl, int stack_size,
+			      int run_ctx_off, u8 **branches)
 {
 	u8 *prog = *pprog;
 	int i;
@@ -1877,8 +1903,8 @@ static int invoke_bpf_mod_ret(const struct btf_func_model *m, u8 **pprog,
 	 */
 	emit_mov_imm32(&prog, false, BPF_REG_0, 0);
 	emit_stx(&prog, BPF_DW, BPF_REG_FP, BPF_REG_0, -8);
-	for (i = 0; i < tp->nr_progs; i++) {
-		if (invoke_bpf_prog(m, &prog, tp->progs[i], stack_size, true))
+	for (i = 0; i < tl->nr_links; i++) {
+		if (invoke_bpf_prog(m, &prog, tl->links[i], stack_size, run_ctx_off, true))
 			return -EINVAL;
 
 	/* mod_ret prog stored return value into [rbp - 8]. Emit:
@@ -1980,14 +2006,14 @@ static bool is_valid_bpf_tramp_flags(unsigned int flags)
  */
 int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *image_end,
 				const struct btf_func_model *m, u32 flags,
-				struct bpf_tramp_progs *tprogs,
+				struct bpf_tramp_links *tlinks,
 				void *orig_call)
 {
 	int ret, i, nr_args = m->nr_args;
-	int regs_off, ip_off, args_off, stack_size = nr_args * 8;
-	struct bpf_tramp_progs *fentry = &tprogs[BPF_TRAMP_FENTRY];
-	struct bpf_tramp_progs *fexit = &tprogs[BPF_TRAMP_FEXIT];
-	struct bpf_tramp_progs *fmod_ret = &tprogs[BPF_TRAMP_MODIFY_RETURN];
+	int regs_off, ip_off, args_off, stack_size = nr_args * 8, run_ctx_off;
+	struct bpf_tramp_links *fentry = &tlinks[BPF_TRAMP_FENTRY];
+	struct bpf_tramp_links *fexit = &tlinks[BPF_TRAMP_FEXIT];
+	struct bpf_tramp_links *fmod_ret = &tlinks[BPF_TRAMP_MODIFY_RETURN];
 	u8 **branches = NULL;
 	u8 *prog;
 	bool save_ret;
@@ -2014,6 +2040,8 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
 	 * RBP - args_off [ args count ] always
 	 *
 	 * RBP - ip_off [ traced function ] BPF_TRAMP_F_IP_ARG flag
+	 *
+	 * RBP - run_ctx_off [ bpf_tramp_run_ctx ]
 	 */
 
 	/* room for return value of orig_call or fentry prog */
@@ -2032,6 +2060,9 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
 
 	ip_off = stack_size;
 
+	stack_size += (sizeof(struct bpf_tramp_run_ctx) + 7) & ~0x7;
+	run_ctx_off = stack_size;
+
 	if (flags & BPF_TRAMP_F_SKIP_FRAME) {
 		/* skip patched call instruction and point orig_call to actual
 		 * body of the kernel function.
@@ -2078,19 +2109,19 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
 		}
 	}
 
-	if (fentry->nr_progs)
-		if (invoke_bpf(m, &prog, fentry, regs_off,
+	if (fentry->nr_links)
+		if (invoke_bpf(m, &prog, fentry, regs_off, run_ctx_off,
 			       flags & BPF_TRAMP_F_RET_FENTRY_RET))
 			return -EINVAL;
 
-	if (fmod_ret->nr_progs) {
-		branches = kcalloc(fmod_ret->nr_progs, sizeof(u8 *),
+	if (fmod_ret->nr_links) {
+		branches = kcalloc(fmod_ret->nr_links, sizeof(u8 *),
 				   GFP_KERNEL);
 		if (!branches)
 			return -ENOMEM;
 
 		if (invoke_bpf_mod_ret(m, &prog, fmod_ret, regs_off,
-				       branches)) {
+				       run_ctx_off, branches)) {
 			ret = -EINVAL;
 			goto cleanup;
 		}
@@ -2111,7 +2142,7 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
 		prog += X86_PATCH_SIZE;
 	}
 
-	if (fmod_ret->nr_progs) {
+	if (fmod_ret->nr_links) {
 		/* From Intel 64 and IA-32 Architectures Optimization
 		 * Reference Manual, 3.4.1.4 Code Alignment, Assembly/Compiler
 		 * Coding Rule 11: All branch targets should be 16-byte
@@ -2121,13 +2152,13 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
 		/* Update the branches saved in invoke_bpf_mod_ret with the
 		 * aligned address of do_fexit.
 		 */
-		for (i = 0; i < fmod_ret->nr_progs; i++)
+		for (i = 0; i < fmod_ret->nr_links; i++)
 			emit_cond_near_jump(&branches[i], prog, branches[i],
 					    X86_JNE);
 	}
 
-	if (fexit->nr_progs)
-		if (invoke_bpf(m, &prog, fexit, regs_off, false)) {
+	if (fexit->nr_links)
+		if (invoke_bpf(m, &prog, fexit, regs_off, run_ctx_off, false)) {
 			ret = -EINVAL;
 			goto cleanup;
 		}
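
Note the stack reservation above: run_ctx_off is placed after rounding
sizeof(struct bpf_tramp_run_ctx) up to an 8-byte multiple with
(size + 7) & ~0x7. A small standalone check of that idiom (the 20-byte
size is a made-up example):

#include <assert.h>
#include <stddef.h>

/* Round x up to the next multiple of 8, as the trampoline does. */
static size_t round_up8(size_t x)
{
	return (x + 7) & ~(size_t)0x7;
}

int main(void)
{
	assert(round_up8(1) == 8);
	assert(round_up8(8) == 8);	/* already aligned: unchanged */
	assert(round_up8(20) == 24);	/* a 20-byte run_ctx costs 24 stack bytes */
	return 0;
}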