BPF: Change callx insn encoding #81546
Currently, the callx insn, which the kernel verifier does not support, uses the 32-bit imm field to store the target register, whereas gcc uses the dst_reg field. The gcc encoding is better. This patch adjusts the encoding to match gcc's.
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
@llvm/pr-subscribers-mc
Author: yonghong-song
Full diff: https://github.com/llvm/llvm-project/pull/81546.diff
3 Files Affected:
diff --git a/llvm/lib/Target/BPF/AsmParser/BPFAsmParser.cpp b/llvm/lib/Target/BPF/AsmParser/BPFAsmParser.cpp
index 90697c6645be2f..0d1eef60c3b550 100644
--- a/llvm/lib/Target/BPF/AsmParser/BPFAsmParser.cpp
+++ b/llvm/lib/Target/BPF/AsmParser/BPFAsmParser.cpp
@@ -229,6 +229,7 @@ struct BPFOperand : public MCParsedAsmOperand {
return StringSwitch<bool>(Name.lower())
.Case("if", true)
.Case("call", true)
+ .Case("callx", true)
.Case("goto", true)
.Case("gotol", true)
.Case("*", true)
diff --git a/llvm/lib/Target/BPF/BPFInstrInfo.td b/llvm/lib/Target/BPF/BPFInstrInfo.td
index 7d443a34490146..690d53420718ff 100644
--- a/llvm/lib/Target/BPF/BPFInstrInfo.td
+++ b/llvm/lib/Target/BPF/BPFInstrInfo.td
@@ -622,9 +622,9 @@ class CALLX<string OpcodeStr>
(ins GPR:$BrDst),
!strconcat(OpcodeStr, " $BrDst"),
[]> {
- bits<32> BrDst;
+ bits<4> BrDst;
- let Inst{31-0} = BrDst;
+ let Inst{51-48} = BrDst;
let BPFClass = BPF_JMP;
}
diff --git a/llvm/test/MC/BPF/insn-unit.s b/llvm/test/MC/BPF/insn-unit.s
index 58342cda7cc0ad..224eb7381aa234 100644
--- a/llvm/test/MC/BPF/insn-unit.s
+++ b/llvm/test/MC/BPF/insn-unit.s
@@ -61,6 +61,9 @@
// CHECK-32: c3 92 10 00 00 00 00 00 lock *(u32 *)(r2 + 16) += w9
// CHECK: db a3 e2 ff 00 00 00 00 lock *(u64 *)(r3 - 30) += r10
+ callx r2
+// CHECK: 8d 02 00 00 00 00 00 00 callx r2
+
// ======== BPF_JMP Class ========
if r1 & r2 goto Llabel0 // BPF_JSET | BPF_X
if r1 & 0xffff goto Llabel0 // BPF_JSET | BPF_K
cc @jemarch
Windows test failed. The failing tests are all MLIR Python tests, and these failures are not really related to this patch.
callx r2
// CHECK: 8d 02 00 00 00 00 00 00 callx r2
I am waiting for a compile to finish, but I wanted to say that it looks right to me. The following is a test case from binutils:
28: 8d 06 00 00 00 00 00 00 callr %r6
Note: The mnemonic will change.
Otherwise, everything looks good! Thanks again.
When clang encounters indirect calls in eBPF programs, it emits a call
instruction with a register parameter (`BPF_X`) instead of an immediate
value (`BPF_K`). This encoding (`BPF_JMP | BPF_CALL | BPF_X = 0x8d`) is
decoded by llvm-objdump as `callx`.
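As a quick check of the opcode arithmetic above, the following sketch composes the two call opcodes from their parts. The constant values are those of the Linux UAPI headers (`linux/bpf_common.h` and `linux/bpf.h`); the two helper functions are hypothetical names used only for illustration.

```c
/* Values as defined in the Linux UAPI headers
 * (linux/bpf_common.h and linux/bpf.h). */
#define BPF_JMP  0x05  /* instruction class */
#define BPF_CALL 0x80  /* jump operation: call */
#define BPF_K    0x00  /* source operand is an immediate */
#define BPF_X    0x08  /* source operand is a register */

/* Hypothetical helper: opcode byte of a direct call (`call imm`). */
static unsigned char bpf_call_opcode(void)
{
    return BPF_JMP | BPF_CALL | BPF_K;
}

/* Hypothetical helper: opcode byte of an indirect call (`callx reg`). */
static unsigned char bpf_callx_opcode(void)
{
    return BPF_JMP | BPF_CALL | BPF_X;
}
```

Here `bpf_call_opcode()` evaluates to `0x85` and `bpf_callx_opcode()` to `0x8d`, the first byte of the `callx` instruction in the listings.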
For example, here is a simple C program with an indirect call:
extern void (*ptr_to_some_function)(void);
void call_ptr_to_some_function(void) {
ptr_to_some_function();
}
Compiling and disassembling it with clang 14.0 (and LLVM 14.0) gives:
$ clang -O2 -target bpf -c indirect_call.c -o indirect_call.ebpf
$ llvm-objdump -rd indirect_call.ebpf
indirect_call.ebpf: file format elf64-bpf
Disassembly of section .text:
0000000000000000 <call_ptr_to_some_function>:
0: 18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r1 = 0 ll
0000000000000000: R_BPF_64_64 ptr_to_some_function
2: 79 11 00 00 00 00 00 00 r1 = *(u64 *)(r1 + 0)
3: 8d 00 00 00 01 00 00 00 callx r1
4: 95 00 00 00 00 00 00 00 exit
Contrary to usual eBPF instructions, `callx`'s register operand is
encoded in the immediate field. This encoding is actually specific to
LLVM (and clang). GCC uses the destination register field to store the
target register.
LLVM 19.1 was modified to use GCC's encoding:
llvm/llvm-project#81546 ("BPF: Change callx insn
encoding"). For example, on an Alpine Linux 3.21 system:
$ clang -target bpf --version
Alpine clang version 19.1.4
Target: bpf
Thread model: posix
InstalledDir: /usr/lib/llvm19/bin
$ clang -O2 -target bpf -c indirect_call.c -o indirect_call.ebpf
$ llvm-objdump -rd indirect_call.ebpf
indirect_call.ebpf: file format elf64-bpf
Disassembly of section .text:
0000000000000000 <call_ptr_to_some_function>:
0: 18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r1 = 0x0 ll
0000000000000000: R_BPF_64_64 ptr_to_some_function
2: 79 11 00 00 00 00 00 00 r1 = *(u64 *)(r1 + 0x0)
3: 8d 01 00 00 00 00 00 00 callx r1
4: 95 00 00 00 00 00 00 00 exit
The instruction is now encoded `8d 01 00...`.
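To make the difference between the two encodings concrete, here is a minimal sketch of a decoder that accepts both. It is not taken from LLVM, GCC or Ghidra: the function name `callx_target_reg` is hypothetical, and the rule "a non-zero immediate means the older LLVM encoding" is an assumption made for this illustration. It assumes little-endian instructions, as in the listings above.

```c
#include <stdint.h>

#define BPF_OP_CALLX 0x8d  /* BPF_JMP | BPF_CALL | BPF_X */

/*
 * Return the register number targeted by a little-endian `callx`
 * instruction, or -1 if `insn` is not a `callx`.
 *
 * Byte layout: opcode, (src_reg << 4 | dst_reg), 16-bit offset,
 * 32-bit immediate.
 *
 * Assumption for this sketch: a non-zero immediate is read as the older
 * LLVM encoding; otherwise the dst_reg field is used. `callx r0` has the
 * same bytes in both encodings, so it needs no special case.
 */
static int callx_target_reg(const uint8_t insn[8])
{
    if (insn[0] != BPF_OP_CALLX)
        return -1;

    uint32_t imm = (uint32_t)insn[4] | ((uint32_t)insn[5] << 8) |
                   ((uint32_t)insn[6] << 16) | ((uint32_t)insn[7] << 24);
    if (imm != 0)
        return (int)imm;      /* older LLVM: register number in imm */

    return insn[1] & 0x0f;    /* GCC and newer LLVM: register in dst_reg */
}
```

With this sketch, both `8d 00 00 00 01 00 00 00` and `8d 01 00 00 00 00 00 00` decode to register 1.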
For reference, here are similar commands using GCC, showing that it uses
the same encoding (the compiler option `-mxbpf` is required here to enable
several features, including indirect calls; cf.
https://gcc.gnu.org/onlinedocs/gcc-12.4.0/gcc/eBPF-Options.html ).
$ bpf-gcc --version
bpf-gcc (12-20220319-1ubuntu1+2) 12.0.1 20220319 (experimental) [master r12-7719-g8ca61ad148f]
$ bpf-gcc -O2 -c indirect_call.c -o indirect_call.ebpf -mxbpf
$ bpf-objdump -mxbpf -rd indirect_call.ebpf
indirect_call_gcc-12.ebpf: file format elf64-bpfle
Disassembly of section .text:
0000000000000000 <call_ptr_to_some_function>:
0: 18 00 00 00 00 00 00 00 lddw %r0,0
8: 00 00 00 00 00 00 00 00
0: R_BPF_INSN_64 ptr_to_some_function
10: 79 01 00 00 00 00 00 00 ldxdw %r1,[%r0+0]
18: 8d 01 00 00 00 00 00 00 call %r1
20: 95 00 00 00 00 00 00 00 exit
Add both `callx` instruction encodings to the eBPF processor.
By the way, the eBPF verifier used by the Linux kernel currently forbids
indirect calls (it fails when `BPF_SRC(insn->code) != BPF_K`, in
https://web.git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/bpf/verifier.c?h=v6.14#n19141
). But other deployments of eBPF may already support this feature.
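The condition quoted above can be illustrated with a simplified sketch. This is not the kernel's code: `is_rejected_indirect_call` is a hypothetical helper, and only the macro definitions mirror the Linux UAPI headers. It shows that opcode `0x8d` fails a `BPF_SRC(code) != BPF_K` test while the direct call opcode `0x85` passes.

```c
#include <stdbool.h>
#include <stdint.h>

/* Definitions as in linux/bpf_common.h and linux/bpf.h. */
#define BPF_CLASS(code) ((code) & 0x07)
#define BPF_OP(code)    ((code) & 0xf0)
#define BPF_SRC(code)   ((code) & 0x08)
#define BPF_JMP   0x05
#define BPF_CALL  0x80
#define BPF_K     0x00

/*
 * Hypothetical helper: true if `opcode` is a call whose source bit is
 * not BPF_K, i.e. an indirect call that a check of the form
 * `BPF_SRC(insn->code) != BPF_K` would refuse.
 */
static bool is_rejected_indirect_call(uint8_t opcode)
{
    return BPF_CLASS(opcode) == BPF_JMP &&
           BPF_OP(opcode) == BPF_CALL &&
           BPF_SRC(opcode) != BPF_K;
}
```

The real verifier performs many more checks on call instructions; this sketch isolates only the source-bit test mentioned above.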
Bump Ghidra HEAD commit 53cca61f8
Changed files:
```
M Ghidra/Processors/MIPS/data/languages/mips.sinc
M Ghidra/Processors/MIPS/data/languages/mips16.sinc
M Ghidra/Processors/eBPF/data/languages/eBPF.sinc
```
Commit details:
```
[Commit 1/7]
Hash: 575dfa7572af0c726fa2d69c512ab486315559e6
Date: 2025-09-10 12:42:00 +0000
Message: GP-5902: Fixed gotos
Files changed:
M Ghidra/Processors/MIPS/data/languages/mips16.sinc
[Commit 2/7]
Hash: a6e9ea090022e9a97dd8411e51caad64dc80e63c
Date: 2025-08-30 15:46:00 +0100
Message: mips: Don't use reserved keywords for names
Files changed:
M Ghidra/Processors/MIPS/data/languages/mips16.sinc
[Commit 3/7]
Hash: a72a68c4612c368c8f9790e586a6246273714ed1
Date: 2025-08-30 14:47:57 +0100
Message: mips: Use & ~1 rather than & -2
Files changed:
M Ghidra/Processors/MIPS/data/languages/mips16.sinc
[Commit 4/7]
Hash: 3c095be95654fb333ea4c22ede44f096b8c341e2
Date: 2025-08-19 20:51:02 +0100
Message: Fix LI failing to match in some cases
Files changed:
M Ghidra/Processors/MIPS/data/languages/mips16.sinc
[Commit 5/7]
Hash: 63919665ec3d07639c6cbe30285640b775c8f099
Date: 2025-08-02 01:42:30 +0100
Message: mips: Correctly handle 64-bit regs in INS and EXT 16e2 instructions
Files changed:
M Ghidra/Processors/MIPS/data/languages/mips16.sinc
[Commit 6/7]
Hash: b31997bba0bcc7502d47060022f8173e42077365
Date: 2025-08-02 01:08:43 +0100
Message: mips: Add mips16e2 instructions
Files changed:
M Ghidra/Processors/MIPS/data/languages/mips.sinc
M Ghidra/Processors/MIPS/data/languages/mips16.sinc
[Commit 7/7]
Hash: 4f3f1059dc67d10db6a82c2c29c93d0d11504401
Date: 2025-04-01 22:24:44 +0200
Message: Add eBPF instruction CALLX for indirect calls
Files changed:
M Ghidra/Processors/eBPF/data/languages/eBPF.sinc
```