BPF: Change callx insn encoding #81546

yonghong-song · 2024-02-12T22:45:30Z

Currently, the callx insn, which the kernel verifier does not support, uses the 32-bit imm field to store the target register. gcc, on the other hand, uses the dst_reg field to store the target register. The gcc encoding is better, so this patch changes the LLVM encoding to match gcc.

Signed-off-by: Yonghong Song <yonghong.song@linux.dev>

llvmbot · 2024-02-12T22:46:01Z

@llvm/pr-subscribers-mc

Author: None (yonghong-song)

Changes

Currently, the kernel verifier unsupported callx insn used the 32-bit imm field to store the target register. On the other hand, gcc used the dst_reg field to store the target register. The gcc encoding is better. This patch adjusted the coding to be the same as gcc.

Full diff: https://github.com/llvm/llvm-project/pull/81546.diff

3 Files Affected:

(modified) llvm/lib/Target/BPF/AsmParser/BPFAsmParser.cpp (+1)
(modified) llvm/lib/Target/BPF/BPFInstrInfo.td (+2-2)
(modified) llvm/test/MC/BPF/insn-unit.s (+3)

diff --git a/llvm/lib/Target/BPF/AsmParser/BPFAsmParser.cpp b/llvm/lib/Target/BPF/AsmParser/BPFAsmParser.cpp
index 90697c6645be2f..0d1eef60c3b550 100644
--- a/llvm/lib/Target/BPF/AsmParser/BPFAsmParser.cpp
+++ b/llvm/lib/Target/BPF/AsmParser/BPFAsmParser.cpp
@@ -229,6 +229,7 @@ struct BPFOperand : public MCParsedAsmOperand {
     return StringSwitch<bool>(Name.lower())
         .Case("if", true)
         .Case("call", true)
+        .Case("callx", true)
         .Case("goto", true)
         .Case("gotol", true)
         .Case("*", true)
diff --git a/llvm/lib/Target/BPF/BPFInstrInfo.td b/llvm/lib/Target/BPF/BPFInstrInfo.td
index 7d443a34490146..690d53420718ff 100644
--- a/llvm/lib/Target/BPF/BPFInstrInfo.td
+++ b/llvm/lib/Target/BPF/BPFInstrInfo.td
@@ -622,9 +622,9 @@ class CALLX<string OpcodeStr>
                    (ins GPR:$BrDst),
                    !strconcat(OpcodeStr, " $BrDst"),
                    []> {
-  bits<32> BrDst;
+  bits<4> BrDst;
 
-  let Inst{31-0} = BrDst;
+  let Inst{51-48} = BrDst;
   let BPFClass = BPF_JMP;
 }
 
diff --git a/llvm/test/MC/BPF/insn-unit.s b/llvm/test/MC/BPF/insn-unit.s
index 58342cda7cc0ad..224eb7381aa234 100644
--- a/llvm/test/MC/BPF/insn-unit.s
+++ b/llvm/test/MC/BPF/insn-unit.s
@@ -61,6 +61,9 @@
 // CHECK-32: c3 92 10 00 00 00 00 00 	lock *(u32 *)(r2 + 16) += w9
 // CHECK: db a3 e2 ff 00 00 00 00 	lock *(u64 *)(r3 - 30) += r10
 
+  callx r2
+// CHECK: 8d 02 00 00 00 00 00 00 	callx r2
+
 // ======== BPF_JMP Class ========
   if r1 & r2 goto Llabel0    // BPF_JSET  | BPF_X
   if r1 & 0xffff goto Llabel0    // BPF_JSET  | BPF_K
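
The encoding change in the diff above can be illustrated with a small sketch. It assumes the usual little-endian BPF instruction layout (one opcode byte, one byte holding src_reg in the high nibble and dst_reg in the low nibble, a 16-bit offset, and a 32-bit immediate). The helper names `encode_insn`, `encode_callx_old`, and `encode_callx_new` are hypothetical and are not part of LLVM.

```python
import struct

# Opcode pieces from the BPF instruction set (class | operation | source).
BPF_JMP = 0x05
BPF_CALL = 0x80
BPF_X = 0x08
CALLX_OPCODE = BPF_JMP | BPF_CALL | BPF_X  # 0x8d


def encode_insn(opcode, dst=0, src=0, off=0, imm=0):
    """Pack one 64-bit little-endian BPF instruction (hypothetical helper)."""
    regs = ((src & 0xF) << 4) | (dst & 0xF)  # src in high nibble, dst in low
    return struct.pack("<BBhi", opcode, regs, off, imm)


def encode_callx_old(reg):
    """Encoding before this patch: target register stored in the imm field."""
    return encode_insn(CALLX_OPCODE, imm=reg)


def encode_callx_new(reg):
    """Encoding after this patch (same as gcc): target register in dst_reg."""
    return encode_insn(CALLX_OPCODE, dst=reg)


if __name__ == "__main__":
    print(encode_callx_old(2).hex(" "))
    print(encode_callx_new(2).hex(" "))
```

With `reg=2`, the new form produces the byte sequence expected by the `CHECK` line added to `insn-unit.s`.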

yonghong-song · 2024-02-12T22:47:42Z

cc @jemarch

yonghong-song · 2024-02-12T23:28:12Z

Windows test failed. The following are failed tests:

********************
Failed Tests (30):
  MLIR :: python/dialects/affine.py
  MLIR :: python/dialects/amdgpu.py
  MLIR :: python/dialects/arith_dialect.py
  MLIR :: python/dialects/arith_llvm.py
  MLIR :: python/dialects/cf.py
  MLIR :: python/dialects/func.py
  MLIR :: python/dialects/linalg/opdsl/arguments.py
  MLIR :: python/dialects/linalg/opdsl/assignments.py
  MLIR :: python/dialects/linalg/opdsl/doctests.py
  MLIR :: python/dialects/linalg/opdsl/emit_convolution.py
  MLIR :: python/dialects/linalg/opdsl/emit_matmul.py
  MLIR :: python/dialects/linalg/opdsl/emit_pooling.py
  MLIR :: python/dialects/linalg/opdsl/metadata.py
  MLIR :: python/dialects/linalg/opdsl/shape_maps_iteration.py
  MLIR :: python/dialects/linalg/opdsl/test_core_named_ops.py
  MLIR :: python/dialects/linalg/ops.py
  MLIR :: python/dialects/memref.py
  MLIR :: python/dialects/ml_program.py
  MLIR :: python/dialects/nvgpu.py
  MLIR :: python/dialects/python_test.py
  MLIR :: python/dialects/scf.py
  MLIR :: python/dialects/tensor.py
  MLIR :: python/dialects/transform_bufferization_ext.py
  MLIR :: python/dialects/transform_extras.py
  MLIR :: python/ir/blocks.py
  MLIR :: python/ir/builtin_types.py
  MLIR :: python/ir/diagnostic_handler.py
  MLIR :: python/ir/dialects.py
  MLIR :: python/ir/operation.py
  MLIR :: python/pass_manager.py
Testing Time: 63.76s
Total Discovered Tests: 2561
  Skipped          :    2 (0.08%)
  Unsupported      :  324 (12.65%)
  Passed           : 2202 (85.98%)
  Expectedly Failed:    1 (0.04%)
  Unresolved       :    2 (0.08%)
  Failed           :   30 (1.17%)

They are all MLIR python tests, and these failures are not really related to this patch.

hawkinsw · 2024-02-13T02:04:18Z

llvm/test/MC/BPF/insn-unit.s

 // CHECK: db a3 e2 ff 00 00 00 00 	lock *(u64 *)(r3 - 30) += r10

+  callx r2
+// CHECK: 8d 02 00 00 00 00 00 00 	callx r2


I am waiting for a compile to finish, but I wanted to say that it looks right to me. The following is a test case from binutils:

28: 8d 06 00 00 00 00 00 00 callr %r6

Note: The mnemonic will change.

Otherwise, everything looks good! Thanks again.

When clang encounters indirect calls in eBPF programs, it emits a call instruction with a register parameter (`BPF_X`) instead of an immediate value (`BPF_K`). This encoding (`BPF_JMP | BPF_CALL | BPF_X = 0x8d`) is decoded by llvm-objdump as `callx`.

For example, here is a simple C program with an indirect call:

    extern void (*ptr_to_some_function)(void);
    void call_ptr_to_some_function(void) {
        ptr_to_some_function();
    }

Compiling and disassembling it with clang 14.0 (and LLVM 14.0) gives:

    $ clang -O2 -target bpf -c indirect_call.c -o indirect_call.ebpf
    $ llvm-objdump -rd indirect_call.ebpf

    indirect_call.ebpf: file format elf64-bpf

    Disassembly of section .text:

    0000000000000000 <call_ptr_to_some_function>:
           0: 18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r1 = 0 ll
              0000000000000000: R_BPF_64_64 ptr_to_some_function
           2: 79 11 00 00 00 00 00 00 r1 = *(u64 *)(r1 + 0)
           3: 8d 00 00 00 01 00 00 00 callx r1
           4: 95 00 00 00 00 00 00 00 exit

Contrary to usual eBPF instructions, `callx`'s register operand is encoded in the immediate field. This encoding is actually specific to LLVM (and clang). GCC used the destination register to store the target register. LLVM 19.1 was modified to use GCC's encoding: llvm/llvm-project#81546 ("BPF: Change callx insn encoding").

For example, in an Alpine Linux 3.21 system:

    $ clang -target bpf --version
    Alpine clang version 19.1.4
    Target: bpf
    Thread model: posix
    InstalledDir: /usr/lib/llvm19/bin
    $ clang -O2 -target bpf -c indirect_call.c -o indirect_call.ebpf
    $ llvm-objdump -rd indirect_call.ebpf

    indirect_call.ebpf: file format elf64-bpf

    Disassembly of section .text:

    0000000000000000 <call_ptr_to_some_function>:
           0: 18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r1 = 0x0 ll
              0000000000000000: R_BPF_64_64 ptr_to_some_function
           2: 79 11 00 00 00 00 00 00 r1 = *(u64 *)(r1 + 0x0)
           3: 8d 01 00 00 00 00 00 00 callx r1
           4: 95 00 00 00 00 00 00 00 exit

The instruction is now encoded `8d 01 00...`.

For reference, here are similar commands using GCC, showing that it uses the same encoding (here, the compiler option `-mxbpf` is required to enable several features including indirect calls, cf. https://gcc.gnu.org/onlinedocs/gcc-12.4.0/gcc/eBPF-Options.html ):

    $ bpf-gcc --version
    bpf-gcc (12-20220319-1ubuntu1+2) 12.0.1 20220319 (experimental) [master r12-7719-g8ca61ad148f]
    $ bpf-gcc -O2 -c indirect_call.c -o indirect_call.ebpf -mxbpf
    $ bpf-objdump -mxbpf -rd indirect_call.ebpf

    indirect_call_gcc-12.ebpf: file format elf64-bpfle

    Disassembly of section .text:

    0000000000000000 <call_ptr_to_some_function>:
       0: 18 00 00 00 00 00 00 00 lddw %r0,0
       8: 00 00 00 00 00 00 00 00
          0: R_BPF_INSN_64 ptr_to_some_function
      10: 79 01 00 00 00 00 00 00 ldxdw %r1,[%r0+0]
      18: 8d 01 00 00 00 00 00 00 call %r1
      20: 95 00 00 00 00 00 00 00 exit

Add both `callx` instruction encodings to the eBPF processor.

By the way, the eBPF verifier used by the Linux kernel currently forbids indirect calls (it fails when `BPF_SRC(insn->code) != BPF_K`, in https://web.git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/bpf/verifier.c?h=v6.14#n19141 ). But other deployments of eBPF may already support this feature.
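
A disassembler that wants to accept both `callx` encodings described above could use logic along the following lines. This is only a sketch under stated assumptions: the function name `callx_target` and the fallback heuristic (treat a non-zero imm with a zero dst_reg as the legacy LLVM form) are hypothetical and are not taken from the Ghidra SLEIGH specification.

```python
import struct

CALLX_OPCODE = 0x8D  # BPF_JMP | BPF_CALL | BPF_X
MAX_REG = 10         # eBPF registers are r0..r10


def callx_target(insn: bytes):
    """Return the target register of a callx instruction, or None.

    Accepts both encodings (hypothetical heuristic):
      * dst_reg form (gcc, LLVM after PR 81546): register in byte 1, low nibble
      * imm form (LLVM before PR 81546): register in the 32-bit immediate
    `insn` must be exactly 8 bytes, little-endian.
    """
    opcode, regs, _off, imm = struct.unpack("<BBhi", insn)
    if opcode != CALLX_OPCODE:
        return None
    dst = regs & 0x0F
    if imm == 0:
        # dst_reg form. `callx r0` is byte-identical in both encodings.
        return dst if dst <= MAX_REG else None
    if dst == 0 and 0 < imm <= MAX_REG:
        # Legacy imm form.
        return imm
    return None  # neither form: not a well-formed callx


if __name__ == "__main__":
    print(callx_target(bytes.fromhex("8d00000001000000")))  # legacy form
    print(callx_target(bytes.fromhex("8d01000000000000")))  # dst_reg form
```

Both sample inputs are the `callx r1` byte sequences from the two clang disassemblies quoted above, so each call should resolve to register 1.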

Bump Ghidra HEAD commit 53cca61f8

Changed files:
```
M Ghidra/Processors/MIPS/data/languages/mips.sinc
M Ghidra/Processors/MIPS/data/languages/mips16.sinc
M Ghidra/Processors/eBPF/data/languages/eBPF.sinc
```

Commit details:
```
[Commit 1/7]
Hash: 575dfa7572af0c726fa2d69c512ab486315559e6
Date: 2025-09-10 12:42:00 +0000
Message: GP-5902: Fixed gotos
Files changed:
M Ghidra/Processors/MIPS/data/languages/mips16.sinc

[Commit 2/7]
Hash: a6e9ea090022e9a97dd8411e51caad64dc80e63c
Date: 2025-08-30 15:46:00 +0100
Message: mips: Don't use reserved keywords for names
Files changed:
M Ghidra/Processors/MIPS/data/languages/mips16.sinc

[Commit 3/7]
Hash: a72a68c4612c368c8f9790e586a6246273714ed1
Date: 2025-08-30 14:47:57 +0100
Message: mips: Use & ~1 rather than & -2
Files changed:
M Ghidra/Processors/MIPS/data/languages/mips16.sinc

[Commit 4/7]
Hash: 3c095be95654fb333ea4c22ede44f096b8c341e2
Date: 2025-08-19 20:51:02 +0100
Message: Fix LI failing to match in some cases
Files changed:
M Ghidra/Processors/MIPS/data/languages/mips16.sinc

[Commit 5/7]
Hash: 63919665ec3d07639c6cbe30285640b775c8f099
Date: 2025-08-02 01:42:30 +0100
Message: mips: Correctly handle 64-bit regs in INS and EXT 16e2 instructions
Files changed:
M Ghidra/Processors/MIPS/data/languages/mips16.sinc

[Commit 6/7]
Hash: b31997bba0bcc7502d47060022f8173e42077365
Date: 2025-08-02 01:08:43 +0100
Message: mips: Add mips16e2 instructions
Files changed:
M Ghidra/Processors/MIPS/data/languages/mips.sinc
M Ghidra/Processors/MIPS/data/languages/mips16.sinc

[Commit 7/7]
Hash: 4f3f1059dc67d10db6a82c2c29c93d0d11504401
Date: 2025-04-01 22:24:44 +0200
Message: Add eBPF instruction CALLX for indirect calls
Details: same text as the commit message quoted above, beginning "When clang encounters indirect calls in eBPF programs".
Files changed:
M Ghidra/Processors/eBPF/data/languages/eBPF.sinc
```

llvmbot added the llvm:mc Machine (object) code label Feb 12, 2024

yonghong-song requested a review from 4ast February 12, 2024 22:45

yonghong-song requested a review from eddyz87 February 12, 2024 22:46

4ast approved these changes Feb 12, 2024

eddyz87 approved these changes Feb 12, 2024

hawkinsw reviewed Feb 13, 2024

yonghong-song merged commit c43ad6c into llvm:main Feb 13, 2024

yonghong-song deleted the callx branch February 8, 2025 06:06

niooss-ledger mentioned this pull request Mar 24, 2025

SIMD-0173: SBPF instruction encoding improvements solana-foundation/solana-improvement-documents#173

Merged

niooss-ledger mentioned this pull request Apr 1, 2025

Add eBPF instruction CALLX for indirect calls NationalSecurityAgency/ghidra#7972

Merged
