-
Notifications
You must be signed in to change notification settings - Fork 13.6k
BPF: Change callx insn encoding #81546
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Currently, the kernel verifier unsupported callx insn used the 32-bit imm field to store the target register. On the other hand, gcc used the dst_reg field to store the target register. The gcc encoding is better. This patch adjusted the coding to be the same as gcc. Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
@llvm/pr-subscribers-mc Author: None (yonghong-song) ChangesCurrently, the kernel verifier unsupported callx insn used the 32-bit imm field to store the target register. On the other hand, gcc used the dst_reg field to store the target register. The gcc encoding is better. This patch adjusted the coding to be the same as gcc. Full diff: https://github.com/llvm/llvm-project/pull/81546.diff 3 Files Affected:
diff --git a/llvm/lib/Target/BPF/AsmParser/BPFAsmParser.cpp b/llvm/lib/Target/BPF/AsmParser/BPFAsmParser.cpp
index 90697c6645be2f..0d1eef60c3b550 100644
--- a/llvm/lib/Target/BPF/AsmParser/BPFAsmParser.cpp
+++ b/llvm/lib/Target/BPF/AsmParser/BPFAsmParser.cpp
@@ -229,6 +229,7 @@ struct BPFOperand : public MCParsedAsmOperand {
return StringSwitch<bool>(Name.lower())
.Case("if", true)
.Case("call", true)
+ .Case("callx", true)
.Case("goto", true)
.Case("gotol", true)
.Case("*", true)
diff --git a/llvm/lib/Target/BPF/BPFInstrInfo.td b/llvm/lib/Target/BPF/BPFInstrInfo.td
index 7d443a34490146..690d53420718ff 100644
--- a/llvm/lib/Target/BPF/BPFInstrInfo.td
+++ b/llvm/lib/Target/BPF/BPFInstrInfo.td
@@ -622,9 +622,9 @@ class CALLX<string OpcodeStr>
(ins GPR:$BrDst),
!strconcat(OpcodeStr, " $BrDst"),
[]> {
- bits<32> BrDst;
+ bits<4> BrDst;
- let Inst{31-0} = BrDst;
+ let Inst{51-48} = BrDst;
let BPFClass = BPF_JMP;
}
diff --git a/llvm/test/MC/BPF/insn-unit.s b/llvm/test/MC/BPF/insn-unit.s
index 58342cda7cc0ad..224eb7381aa234 100644
--- a/llvm/test/MC/BPF/insn-unit.s
+++ b/llvm/test/MC/BPF/insn-unit.s
@@ -61,6 +61,9 @@
// CHECK-32: c3 92 10 00 00 00 00 00 lock *(u32 *)(r2 + 16) += w9
// CHECK: db a3 e2 ff 00 00 00 00 lock *(u64 *)(r3 - 30) += r10
+ callx r2
+// CHECK: 8d 02 00 00 00 00 00 00 callx r2
+
// ======== BPF_JMP Class ========
if r1 & r2 goto Llabel0 // BPF_JSET | BPF_X
if r1 & 0xffff goto Llabel0 // BPF_JSET | BPF_K
|
cc @jemarch |
Windows test failed. The following are failed tests:
They are all MLIR python tests, and these failures are not really related to this patch. |
@@ -61,6 +61,9 @@ | |||
// CHECK-32: c3 92 10 00 00 00 00 00 lock *(u32 *)(r2 + 16) += w9 | |||
// CHECK: db a3 e2 ff 00 00 00 00 lock *(u64 *)(r3 - 30) += r10 | |||
|
|||
callx r2 | |||
// CHECK: 8d 02 00 00 00 00 00 00 callx r2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am waiting for a compile to finish, but I wanted to say that it looks right to me. The following is a test case from binutils:
28: 8d 06 00 00 00 00 00 00 callr %r6
Note: The mnemonic will change.
Otherwise, everything looks good! Thanks again.
When clang encounters indirect calls in eBPF programs, it emits a call instruction with a register parameter (`BPF_X`) instead of an immediate value (`BPF_K`). This encoding (`BPF_JMP | BPF_CALL | BPF_X = 0x8d`) is decoded by llvm-objdump as `callx`. For example, here is a simple C program with an indirect call: extern void (*ptr_to_some_function)(void); void call_ptr_to_some_function(void) { ptr_to_some_function(); } Compiling and disassembling it gives with clang 14.0 (and LLVM 14.0): $ clang -O2 -target bpf -c indirect_call.c -o indirect_call.ebpf $ llvm-objdump -rd indirect_call.ebpf indirect_call.ebpf: file format elf64-bpf Disassembly of section .text: 0000000000000000 <call_ptr_to_some_function>: 0: 18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r1 = 0 ll 0000000000000000: R_BPF_64_64 ptr_to_some_function 2: 79 11 00 00 00 00 00 00 r1 = *(u64 *)(r1 + 0) 3: 8d 00 00 00 01 00 00 00 callx r1 4: 95 00 00 00 00 00 00 00 exit Contrary to usual eBPF instruction, `callx`'s register operand is encoded in the immediate field. This encoding is actually specific to LLVM (and clang). GCC used the destination register to store the target register. LLVM 19.1 was modified to use GCC's encoding: llvm/llvm-project#81546 ("BPF: Change callx insn encoding"). For example, in an Alpine Linux 3.21 system: $ clang -target bpf --version Alpine clang version 19.1.4 Target: bpf Thread model: posix InstalledDir: /usr/lib/llvm19/bin $ clang -O2 -target bpf -c indirect_call.c -o indirect_call.ebpf $ llvm-objdump -rd indirect_call.ebpf indirect_call.ebpf: file format elf64-bpf Disassembly of section .text: 0000000000000000 <call_ptr_to_some_function>: 0: 18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r1 = 0x0 ll 0000000000000000: R_BPF_64_64 ptr_to_some_function 2: 79 11 00 00 00 00 00 00 r1 = *(u64 *)(r1 + 0x0) 3: 8d 01 00 00 00 00 00 00 callx r1 4: 95 00 00 00 00 00 00 00 exit The instruction is now encoded `8d 01 00...`. For refrence, here are similar commands using GCC showing it is using the same encoding (here, compiler option `-mxbpf` is required to enable several features including indirect calls, cf. https://gcc.gnu.org/onlinedocs/gcc-12.4.0/gcc/eBPF-Options.html ). $ bpf-gcc --version bpf-gcc (12-20220319-1ubuntu1+2) 12.0.1 20220319 (experimental) [master r12-7719-g8ca61ad148f] $ bpf-gcc -O2 -c indirect_call.c -o indirect_call.ebpf -mxbpf $ bpf-objdump -mxbpf -rd indirect_call.ebpf indirect_call_gcc-12.ebpf: file format elf64-bpfle Disassembly of section .text: 0000000000000000 <call_ptr_to_some_function>: 0: 18 00 00 00 00 00 00 00 lddw %r0,0 8: 00 00 00 00 00 00 00 00 0: R_BPF_INSN_64 ptr_to_some_function 10: 79 01 00 00 00 00 00 00 ldxdw %r1,[%r0+0] 18: 8d 01 00 00 00 00 00 00 call %r1 20: 95 00 00 00 00 00 00 00 exit Add both `callx` instruction encodings to eBPF processor. By the way, the eBPF Verifier used by Linux kernel currently forbids indirect calls (it fails when `BPF_SRC(insn->code) != BPF_K`, in https://web.git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/bpf/verifier.c?h=v6.14#n19141 ). But other deployments of eBPF may already support this feature.
When clang encounters indirect calls in eBPF programs, it emits a call instruction with a register parameter (`BPF_X`) instead of an immediate value (`BPF_K`). This encoding (`BPF_JMP | BPF_CALL | BPF_X = 0x8d`) is decoded by llvm-objdump as `callx`. For example, here is a simple C program with an indirect call: extern void (*ptr_to_some_function)(void); void call_ptr_to_some_function(void) { ptr_to_some_function(); } Compiling and disassembling it gives with clang 14.0 (and LLVM 14.0): $ clang -O2 -target bpf -c indirect_call.c -o indirect_call.ebpf $ llvm-objdump -rd indirect_call.ebpf indirect_call.ebpf: file format elf64-bpf Disassembly of section .text: 0000000000000000 <call_ptr_to_some_function>: 0: 18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r1 = 0 ll 0000000000000000: R_BPF_64_64 ptr_to_some_function 2: 79 11 00 00 00 00 00 00 r1 = *(u64 *)(r1 + 0) 3: 8d 00 00 00 01 00 00 00 callx r1 4: 95 00 00 00 00 00 00 00 exit Contrary to usual eBPF instruction, `callx`'s register operand is encoded in the immediate field. This encoding is actually specific to LLVM (and clang). GCC used the destination register to store the target register. LLVM 19.1 was modified to use GCC's encoding: llvm/llvm-project#81546 ("BPF: Change callx insn encoding"). For example, in an Alpine Linux 3.21 system: $ clang -target bpf --version Alpine clang version 19.1.4 Target: bpf Thread model: posix InstalledDir: /usr/lib/llvm19/bin $ clang -O2 -target bpf -c indirect_call.c -o indirect_call.ebpf $ llvm-objdump -rd indirect_call.ebpf indirect_call.ebpf: file format elf64-bpf Disassembly of section .text: 0000000000000000 <call_ptr_to_some_function>: 0: 18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r1 = 0x0 ll 0000000000000000: R_BPF_64_64 ptr_to_some_function 2: 79 11 00 00 00 00 00 00 r1 = *(u64 *)(r1 + 0x0) 3: 8d 01 00 00 00 00 00 00 callx r1 4: 95 00 00 00 00 00 00 00 exit The instruction is now encoded `8d 01 00...`. For reference, here are similar commands using GCC showing it is using the same encoding (here, compiler option `-mxbpf` is required to enable several features including indirect calls, cf. https://gcc.gnu.org/onlinedocs/gcc-12.4.0/gcc/eBPF-Options.html ). $ bpf-gcc --version bpf-gcc (12-20220319-1ubuntu1+2) 12.0.1 20220319 (experimental) [master r12-7719-g8ca61ad148f] $ bpf-gcc -O2 -c indirect_call.c -o indirect_call.ebpf -mxbpf $ bpf-objdump -mxbpf -rd indirect_call.ebpf indirect_call_gcc-12.ebpf: file format elf64-bpfle Disassembly of section .text: 0000000000000000 <call_ptr_to_some_function>: 0: 18 00 00 00 00 00 00 00 lddw %r0,0 8: 00 00 00 00 00 00 00 00 0: R_BPF_INSN_64 ptr_to_some_function 10: 79 01 00 00 00 00 00 00 ldxdw %r1,[%r0+0] 18: 8d 01 00 00 00 00 00 00 call %r1 20: 95 00 00 00 00 00 00 00 exit Add both `callx` instruction encodings to eBPF processor. By the way, the eBPF Verifier used by Linux kernel currently forbids indirect calls (it fails when `BPF_SRC(insn->code) != BPF_K`, in https://web.git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/bpf/verifier.c?h=v6.14#n19141 ). But other deployments of eBPF may already support this feature.
When clang encounters indirect calls in eBPF programs, it emits a call instruction with a register parameter (`BPF_X`) instead of an immediate value (`BPF_K`). This encoding (`BPF_JMP | BPF_CALL | BPF_X = 0x8d`) is decoded by llvm-objdump as `callx`. For example, here is a simple C program with an indirect call: extern void (*ptr_to_some_function)(void); void call_ptr_to_some_function(void) { ptr_to_some_function(); } Compiling and disassembling it gives with clang 14.0 (and LLVM 14.0): $ clang -O2 -target bpf -c indirect_call.c -o indirect_call.ebpf $ llvm-objdump -rd indirect_call.ebpf indirect_call.ebpf: file format elf64-bpf Disassembly of section .text: 0000000000000000 <call_ptr_to_some_function>: 0: 18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r1 = 0 ll 0000000000000000: R_BPF_64_64 ptr_to_some_function 2: 79 11 00 00 00 00 00 00 r1 = *(u64 *)(r1 + 0) 3: 8d 00 00 00 01 00 00 00 callx r1 4: 95 00 00 00 00 00 00 00 exit Contrary to usual eBPF instructions, `callx`'s register operand is encoded in the immediate field. This encoding is actually specific to LLVM (and clang). GCC used the destination register to store the target register. LLVM 19.1 was modified to use GCC's encoding: llvm/llvm-project#81546 ("BPF: Change callx insn encoding"). For example, in an Alpine Linux 3.21 system: $ clang -target bpf --version Alpine clang version 19.1.4 Target: bpf Thread model: posix InstalledDir: /usr/lib/llvm19/bin $ clang -O2 -target bpf -c indirect_call.c -o indirect_call.ebpf $ llvm-objdump -rd indirect_call.ebpf indirect_call.ebpf: file format elf64-bpf Disassembly of section .text: 0000000000000000 <call_ptr_to_some_function>: 0: 18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r1 = 0x0 ll 0000000000000000: R_BPF_64_64 ptr_to_some_function 2: 79 11 00 00 00 00 00 00 r1 = *(u64 *)(r1 + 0x0) 3: 8d 01 00 00 00 00 00 00 callx r1 4: 95 00 00 00 00 00 00 00 exit The instruction is now encoded `8d 01 00...`. For reference, here are similar commands using GCC showing it is using the same encoding (here, compiler option `-mxbpf` is required to enable several features including indirect calls, cf. https://gcc.gnu.org/onlinedocs/gcc-12.4.0/gcc/eBPF-Options.html ). $ bpf-gcc --version bpf-gcc (12-20220319-1ubuntu1+2) 12.0.1 20220319 (experimental) [master r12-7719-g8ca61ad148f] $ bpf-gcc -O2 -c indirect_call.c -o indirect_call.ebpf -mxbpf $ bpf-objdump -mxbpf -rd indirect_call.ebpf indirect_call_gcc-12.ebpf: file format elf64-bpfle Disassembly of section .text: 0000000000000000 <call_ptr_to_some_function>: 0: 18 00 00 00 00 00 00 00 lddw %r0,0 8: 00 00 00 00 00 00 00 00 0: R_BPF_INSN_64 ptr_to_some_function 10: 79 01 00 00 00 00 00 00 ldxdw %r1,[%r0+0] 18: 8d 01 00 00 00 00 00 00 call %r1 20: 95 00 00 00 00 00 00 00 exit Add both `callx` instruction encodings to eBPF processor. By the way, the eBPF Verifier used by Linux kernel currently forbids indirect calls (it fails when `BPF_SRC(insn->code) != BPF_K`, in https://web.git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/bpf/verifier.c?h=v6.14#n19141 ). But other deployments of eBPF may already support this feature.
Currently, the kernel verifier unsupported callx insn used the 32-bit imm field to store the target register. On the other hand, gcc used the dst_reg field to store the target register. The gcc encoding is better. This patch adjusted the coding to be the same as gcc.