libbpf: Make optimized uprobes backward compatible #10306
base: bpf-next_base
Upstream branch: c427320
We can currently optimize uprobes on top of nop5 instructions, so an application can define USDT_NOP to nop5 and use the USDT macro to define optimized usdt probes.

This works fine on new kernels, but could carry a performance penalty on older kernels that lack the support to optimize and to emulate the nop5 instruction.

Execution of the usdt probe on top of nop:
  - nop -> trigger usdt -> emulate nop -> continue

Execution of the usdt probe on top of nop5:
  - nop5 -> trigger usdt -> single step nop5 -> continue

Note the 'single step nop5' as the source of the performance regression.

To work around that, we change the USDT macro to emit nop,nop5 for the probe (instead of the default nop) and record that in the USDT record (more on that below).

This can be detected by the application (libbpf), which can then place the uprobe either on nop or on nop5 based on the optimization support in the kernel.

We record the use of the nop,nop5 instructions in the USDT ELF note data.

The current ELF note format is as follows:

  namesz (4B) | descsz (4B) | type (4B) | name | desc

And the current usdt record (with the "stapsdt" name) placed in the note's desc data looks like:

  loc_addr  | 8 bytes
  base_addr | 8 bytes
  sema_addr | 8 bytes
  provider  | zero terminated string
  name      | zero terminated string
  args      | zero terminated string

None of the tested parsers (bpftrace-bcc, libbpf) checked that the zero terminating byte of args is the actual end of the 'desc' data. As Andrii suggested, we can use this and place an extra zero byte right there as an indication to the parser that we use the nop,nop5 instructions.

It's a bit tricky, but the other way would be to introduce a new ELF note type or note name and change all existing parsers to recognize it. With the change above, the existing parsers would still recognize such usdt probes.

Note that we do not emit this extra byte if the app defined its own nop through the USDT_NOP macro.

Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
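The record layout described above can be sketched as a small parser. This is only an illustration under stated assumptions, not libbpf's actual code: the struct and function names (usdt_note_sketch, skip_str, parse_usdt_desc) are hypothetical, the addresses are assumed to be 8 bytes wide in host byte order, and 'descsz' is assumed to be the exact desc size from the note header.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record type for this sketch; not libbpf's internal struct. */
struct usdt_note_sketch {
	uint64_t loc_addr;
	uint64_t base_addr;
	uint64_t sema_addr;
	const char *provider;
	const char *name;
	const char *args;
	bool has_nop_combo;	/* extra zero byte found after the args string */
};

/* Return a pointer just past the NUL that terminates the string at p,
 * or NULL if there is no NUL before end. */
static const char *skip_str(const char *p, const char *end)
{
	const char *nul = memchr(p, '\0', (size_t)(end - p));

	return nul ? nul + 1 : NULL;
}

/* Parse the desc data of a "stapsdt" note. Returns 0 on success and
 * -1 if the data is truncated or a string is not terminated. */
static int parse_usdt_desc(const void *desc, size_t descsz,
			   struct usdt_note_sketch *out)
{
	const char *p = desc, *end = p + descsz;

	if (descsz < 3 * sizeof(uint64_t))
		return -1;

	/* Three fixed-size addresses; memcpy avoids unaligned loads. */
	memcpy(&out->loc_addr, p, sizeof(uint64_t));
	memcpy(&out->base_addr, p + 8, sizeof(uint64_t));
	memcpy(&out->sema_addr, p + 16, sizeof(uint64_t));
	p += 24;

	/* Three zero terminated strings follow back to back. */
	out->provider = p;
	if (!(p = skip_str(p, end)))
		return -1;
	out->name = p;
	if (!(p = skip_str(p, end)))
		return -1;
	out->args = p;
	if (!(p = skip_str(p, end)))
		return -1;

	/* Older parsers stop here. One more zero byte before the end of
	 * desc marks a probe that was emitted as nop,nop5. */
	out->has_nop_combo = p < end && *p == '\0';
	return 0;
}
```

A parser that never looks past the args terminator sees the same three strings either way, which is why existing tools keep working with the extended record.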
Adding uprobe syscall feature detection that will be used in the following changes.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Adding support to parse extra info in the usdt note record that indicates nop,nop5 was emitted for the probe.

We detect this by checking for an extra zero byte placed between the zero terminating byte of args and the end of the desc data. Please see [1] for more details.

Together with the uprobe syscall feature detection, we can decide whether to place the probe on top of nop or on top of nop5.

[1] https://github.com/libbpf/usdt

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
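The placement decision described above might look roughly like the following. This is a sketch rather than the code from the patch: the helper names (is_nop_nop5, usdt_attach_addr) are hypothetical, the two boolean inputs stand in for the note parsing result and the uprobe syscall feature detection, and it assumes x86-64, where nop is the single byte 0x90 and nop5 is 0f 1f 44 00 00, so nop5 starts one byte after the note's loc_addr.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Check that the bytes at the probe location are the x86-64 nop,nop5
 * combo: 0x90 followed by 0f 1f 44 00 00. Whether a real parser does
 * this extra check is an assumption of this sketch. */
static bool is_nop_nop5(const uint8_t *insn, size_t len)
{
	static const uint8_t combo[] = { 0x90, 0x0f, 0x1f, 0x44, 0x00, 0x00 };

	return len >= sizeof(combo) && memcmp(insn, combo, sizeof(combo)) == 0;
}

/* Pick the address to attach the uprobe to.
 *   has_nop_combo:            extra zero byte was found in the usdt note
 *   kernel_has_uprobe_syscall: result of the feature detection
 * Only when both hold is the probe moved onto nop5, which the kernel
 * can optimize; otherwise it stays on the plain nop, which older
 * kernels can emulate without single stepping. */
static uint64_t usdt_attach_addr(uint64_t loc_addr, bool has_nop_combo,
				 bool kernel_has_uprobe_syscall)
{
	if (has_nop_combo && kernel_has_uprobe_syscall)
		return loc_addr + 1;	/* skip the 1-byte nop */
	return loc_addr;
}
```

Falling back to loc_addr in every other case keeps the behavior on older kernels identical to a probe built with the default nop.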
Adding a test that attaches a bpf program to a usdt probe in 2 scenarios:

  - attach the program on top of usdt_1, which is a standard nop probe incidentally followed by nop5. The usdt probe does not have extra data in the ELF note record, so we expect the probe to land on the first nop without being optimized.

  - attach the program on top of usdt_2, which is a probe defined on top of the nop,nop5 combo. The extra data in the ELF note record and the presence of the uprobe syscall ensure that the probe is placed on top of nop5 and optimized.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Pull request for series with
subject: libbpf: Make optimized uprobes backward compatible
version: 1
url: https://patchwork.kernel.org/project/netdevbpf/list/?series=1024135