feat(agnocast_kmod): replace do_exit kprobe with sched_process_exit tracepoint #1089

Draft
k1832 wants to merge 2 commits into tier4:main from k1832:kmod/2.6-tracepoint-exit-hook

@k1832 k1832 commented Feb 20, 2026

Description

Replace the do_exit kprobe with a sched_process_exit tracepoint for process exit detection.

Why:

| Aspect | kprobe on `do_exit` | `sched_process_exit` tracepoint |
|---|---|---|
| Stability | `do_exit` is an internal symbol; no stability guarantee across kernel versions | Stable kernel ABI since Linux 2.6.35 (2010) |
| Hook mechanism | int3 breakpoint trap (x86) or BRK (arm64) | Static key (nop when disabled, jmp when enabled) |
| Cost per invocation | ~0.5-1.0 us (trap + single-step) | ~20-50 ns (direct call) |
| Preemption | Handler runs with preemption disabled | Callback runs in normal preemptible context |
| Unload safety | `unregister_kprobe()`: no explicit "all callbacks done" guarantee | `tracepoint_synchronize_unregister()`: formal guarantee that no callbacks are in flight |
| Task context | Uses `current->pid` | Receives `task_struct *` directly; correct by construction |
| Version compat | Needs `#if LINUX_VERSION_CODE` guards (6.2+ `__noreturn`, 6.7+ inlining) | Identical API across all target kernels (5.x, 6.x) |

The tracepoint symbol (__tracepoint_sched_process_exit) is not exported to modules, so we use dynamic lookup via for_each_kernel_tracepoint() + tracepoint_probe_register(), both of which are EXPORT_SYMBOL_GPL and compatible with the module's MODULE_LICENSE("Dual BSD/GPL").
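The lookup-by-name pattern described above can be modeled in plain userspace C. This is a sketch under stated assumptions, not the module's code: `struct model_tracepoint`, `model_for_each_tracepoint()`, `match_by_name()`, and `find_tracepoint()` are hypothetical stand-ins, and the table of names is made up for the example. In the kernel, the iteration is done by for_each_kernel_tracepoint(), whose callback receives a `struct tracepoint *` plus a private pointer, and the tracepoint found this way is what gets passed to tracepoint_probe_register().

```c
#include <stddef.h>
#include <string.h>

/* Stand-in for the kernel's struct tracepoint; only the name is modeled. */
struct model_tracepoint {
    const char *name;
};

/* Hypothetical table standing in for the kernel's built-in tracepoints. */
static struct model_tracepoint model_tracepoints[] = {
    { "sched_process_fork" },
    { "sched_process_exit" },
    { "sched_switch" },
};

/* Stand-in for for_each_kernel_tracepoint(): visit every entry once. */
static void model_for_each_tracepoint(
    void (*fct)(struct model_tracepoint *tp, void *priv), void *priv)
{
    size_t n = sizeof(model_tracepoints) / sizeof(model_tracepoints[0]);

    for (size_t i = 0; i < n; i++)
        fct(&model_tracepoints[i], priv);
}

/* State carried through the iteration via the private pointer. */
struct tp_lookup {
    const char *wanted;
    struct model_tracepoint *found;
};

/* Visitor: remember the first tracepoint whose name matches. */
static void match_by_name(struct model_tracepoint *tp, void *priv)
{
    struct tp_lookup *lookup = priv;

    if (!lookup->found && strcmp(tp->name, lookup->wanted) == 0)
        lookup->found = tp;
}

/* Returns the matching tracepoint, or NULL if none has that name. */
static struct model_tracepoint *find_tracepoint(const char *name)
{
    struct tp_lookup lookup = { .wanted = name, .found = NULL };

    model_for_each_tracepoint(match_by_name, &lookup);
    return lookup.found;
}
```

A module following this pattern would presumably treat a NULL result as an initialization failure instead of registering a probe against a missing tracepoint.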

Behavioral equivalence: The kprobe handler and tracepoint callback both call enqueue_exit_pid(pid). The cleanup logic (process_exit_cleanup, exit_worker_thread, ring buffer) is completely unchanged. task->pid is valid at both hook points: the PID is not freed until exit_notify() → release_task(), which runs after the tracepoint.
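To make the equivalence argument concrete, here is a small userspace model in which both hook styles funnel into one enqueue function. Everything in it is an assumption made for illustration: the queue layout, its capacity, the `model_` names, and the extra queue parameter are hypothetical and do not reflect the module's actual ring buffer or the real signature of enqueue_exit_pid().

```c
#include <stdbool.h>
#include <stddef.h>

#define MODEL_QUEUE_SIZE 8 /* hypothetical capacity, chosen for the example */

/* Hypothetical stand-in for the module's ring buffer of exited PIDs. */
struct model_exit_queue {
    int pids[MODEL_QUEUE_SIZE];
    size_t head; /* next slot to read */
    size_t tail; /* next slot to write */
};

/* Minimal model of a task: only the pid field matters here. */
struct model_task {
    int pid;
};

/* Models the kernel's ambient "current" task pointer. */
static struct model_task *model_current;

/* Shared entry point: append a PID, or fail when the queue is full. */
static bool model_enqueue_exit_pid(struct model_exit_queue *q, int pid)
{
    size_t next = (q->tail + 1) % MODEL_QUEUE_SIZE;

    if (next == q->head)
        return false; /* one slot stays empty to tell full from empty */
    q->pids[q->tail] = pid;
    q->tail = next;
    return true;
}

/* Worker side: pop the oldest PID; returns false when the queue is empty. */
static bool model_dequeue_exit_pid(struct model_exit_queue *q, int *pid)
{
    if (q->head == q->tail)
        return false;
    *pid = q->pids[q->head];
    q->head = (q->head + 1) % MODEL_QUEUE_SIZE;
    return true;
}

/* kprobe-style hook: the PID is read from ambient "current" state. */
static bool kprobe_style_hook(struct model_exit_queue *q)
{
    return model_enqueue_exit_pid(q, model_current->pid);
}

/* Tracepoint-style hook: the exiting task is passed in explicitly. */
static bool tracepoint_style_hook(struct model_exit_queue *q, void *data,
                                  const struct model_task *task)
{
    (void)data; /* registration cookie, unused in this model */
    return model_enqueue_exit_pid(q, task->pid);
}
```

In this model the consumer cannot tell which hook produced an entry, which is the property the PR relies on when it leaves the cleanup path untouched.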

Benchmark (50000 × /bin/true on same machine):

| Branch | real | user | sys |
|---|---|---|---|
| main (kprobe) | 3m10.299s | 0m29.098s | 2m40.464s |
| tracepoint | 3m10.931s | 0m28.792s | 2m40.739s |

No measurable throughput difference. This is expected — the ~1 us per-call saving is invisible in a test dominated by fork+exec+exit cost (~3.8 ms per iteration). The real value is portability (stable API), unload safety (tracepoint_synchronize_unregister), and eliminating preemption-disabled windows.
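The figures above can be checked with a quick back-of-the-envelope calculation, taking the main-branch wall-clock time of 3m10.299s (190.299 s) and assuming the ~1 us per-exit saving quoted above:

$$
\begin{aligned}
t_{\text{iter}} &= \frac{190.299\ \text{s}}{50000} \approx 3.81\ \text{ms} \\
\frac{1\ \mu\text{s}}{t_{\text{iter}}} &\approx 2.6 \times 10^{-4} \approx 0.026\% \\
50000 \times 1\ \mu\text{s} &= 0.05\ \text{s}
\end{aligned}
$$

The expected total saving of about 0.05 s is an order of magnitude smaller than the 0.63 s difference between the two rows (190.931 s versus 190.299 s), which here even points in the opposite direction, so this benchmark cannot resolve the per-call cost.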

Related links

How was this PR tested?

  • Autoware (required)
  • bash scripts/test/e2e_test_1to1 (required)
  • bash scripts/test/e2e_test_2to2 (required)
  • kunit tests (required when modifying the kernel module)
  • sample application

KUnit tests do not need changes — they call enqueue_exit_pid() / process_exit_cleanup() directly and do not exercise the hook mechanism.

Notes for reviewers

  • for_each_kernel_tracepoint() is used for dynamic tracepoint lookup because the static tracepoint symbol is not exported to modules. This is a standard pattern used by other out-of-tree modules.
  • tracepoint_synchronize_unregister() is called during module exit to guarantee no callbacks are in flight before the module is unloaded.

…it tracepoint

Replace the kprobe on the internal `do_exit` symbol with the stable
`sched_process_exit` tracepoint. This improves portability (do_exit is
not a stable kernel API), reduces per-exit overhead (~20x less latency),
and provides a formal unload safety guarantee via
tracepoint_synchronize_unregister().

Use for_each_kernel_tracepoint() for dynamic lookup since the tracepoint
symbol is not exported to modules.

Signed-off-by: Keita Morisaki <keita.morisaki@tier4.jp>
Koichi98 added the next-release and need-patch-update (Bug fixes and other changes - requires PATCH version update) labels on Feb 20, 2026