Skip to content

Comments

libbpf: Make optimized uprobes backward compatible#11117

Open
kernel-patches-daemon-bpf[bot] wants to merge 5 commits intobpf-next_basefrom
series/1055870=>bpf-next
Open

libbpf: Make optimized uprobes backward compatible#11117
kernel-patches-daemon-bpf[bot] wants to merge 5 commits intobpf-next_basefrom
series/1055870=>bpf-next

Conversation

@kernel-patches-daemon-bpf
Copy link

Pull request for series with
subject: libbpf: Make optimized uprobes backward compatible
version: 1
url: https://patchwork.kernel.org/project/netdevbpf/list/?series=1055870

Adding uprobe syscall feature detection that will be used
in following changes.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Adding support to detect nop,nop5 instructions combo for usdt probe
by checking on probe's following nop5 instruction.

When the nop,nop5 combo is detected together with uprobe syscall,
we can place the probe on top of nop5 and get it optimized.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Syncing latest usdt.h change [1].

Now that we have nop5 optimization support in kernel, let's emit
nop,nop5 for usdt probe. We leave it up to the library to use
desirable nop instruction.

[1] libbpf/usdt@c9865d1
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Adding test that attaches bpf program on usdt probe in 2 scenarios;

- attach program on top of usdt_1, which is single nop instruction,
  so the probe stays on nop instruction and is not optimized.

- attach program on top of usdt_2 which is probe defined on top
  of nop,nop5 combo, so the probe is placed on top of nop5 and
  is optimized.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Adding usdt trigger bench for usdt:
 trig-usdt-nop  - usdt on top of nop1 instruction
 trig-usdt-nop5 - usdt on top of nop1/nop5 combo

Adding it to benchs/run_bench_uprobes.sh script.

Example run on x86_64 kernel with uprobe syscall:

  # ./benchs/run_bench_uprobes.sh
  usermode-count :  152.507 ± 0.098M/s
  syscall-count  :   14.309 ± 0.093M/s
  uprobe-nop     :    3.190 ± 0.012M/s
  uprobe-push    :    3.057 ± 0.004M/s
  uprobe-ret     :    1.095 ± 0.009M/s
  uprobe-nop5    :    7.305 ± 0.034M/s
  uretprobe-nop  :    2.175 ± 0.005M/s
  uretprobe-push :    2.109 ± 0.003M/s
  uretprobe-ret  :    0.945 ± 0.002M/s
  uretprobe-nop5 :    3.530 ± 0.006M/s
  usdt-nop       :    3.235 ± 0.008M/s   <-- added
  usdt-nop5      :    7.511 ± 0.045M/s   <-- added

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
@kernel-patches-daemon-bpf
Copy link
Author

Upstream branch: 9cd168a
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=1055870
version: 1

@kernel-patches-review-bot
Copy link

commit 9c703f7538e0e5fac8dd915ab2bcff01a5ded064
Author: Jiri Olsa <jolsa@kernel.org>

libbpf: Add support to detect nop,nop5 instructions combo for usdt probe

Adds detection of the nop,nop5 instruction combo at USDT probe locations
on x86_64. When the kernel supports the uprobe syscall and the combo is
found, the probe is shifted +1 byte onto the nop5 for optimization.

> diff --git a/tools/lib/bpf/usdt.c b/tools/lib/bpf/usdt.c
> index d1524f6f54ae..c81e0a58caf5 100644
> --- a/tools/lib/bpf/usdt.c
> +++ b/tools/lib/bpf/usdt.c

[ ... ]

> +#if defined(__x86_64__)
> +static bool has_nop_combo(int fd, long off)
> +{
> +	static unsigned char nop_combo[6] = {
           ^^^^^^^^^^^^^^^^^^^^^^

Should nop_combo be declared const? It is a read-only lookup table
used only in the memcmp() below.

> +		0x90, 0x0f, 0x1f, 0x44, 0x00, 0x00 /* nop,nop5 */
> +	};
> +	unsigned char buf[6] = {};
> +
> +	if (pread(fd, buf, 6, off) != 6)
> +		return false;
> +	return memcmp(buf, nop_combo, 6) == 0;
> +}

[ ... ]



AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md

In-Reply-To-Subject: libbpf: Add support to detect nop,nop5 instructions combo for usdt probe
CI run summary: https://github.com/kernel-patches/bpf/actions/runs/22221513473

AI-authorship-score: low
AI-authorship-explanation: Domain-specific optimization from established BPF/tracing maintainer with iterative lore review history and deep x86_64 uprobe internals knowledge.
issues-found: 1
issue-severity-score: low
issue-severity-explanation: Missing const qualifier on a static read-only byte array is a minor C best practice deviation with no runtime impact.

@kernel-patches-daemon-bpf
Copy link
Author

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant