bpftime-super: Extending eBPF Programmability and Observability to GPUs


bpftime-super is the first system to dynamically offload eBPF instrumentation and bytecode directly onto running GPU kernels using real-time PTX injection, significantly reducing instrumentation overhead compared to existing methods.

Installation

```shell
git clone https://github.com/eunomia-bpf/bpftime-super.git
cd bpftime-super
make release
```

eGPU – Extending eBPF Programmability & Observability to GPUs

eGPU is the first open-source framework that lets you run eBPF programs inside live GPU kernels. By JIT-translating eBPF bytecode to NVIDIA PTX at runtime, eGPU injects ultra-lightweight probes directly into running kernels without pausing or recompiling them. The result is microsecond-level visibility into kernel execution, memory transfers, and heterogeneous orchestration with minimal overhead.


Why eGPU?

  • Traditional GPU profilers (CUPTI, NVBit, …) either interrupt kernels or impose high per‑event cost.
  • Linux eBPF offers elegant, safe instrumentation—but only for CPUs.
  • Modern AI & HPC workloads need continuous telemetry across both CPU and GPU to catch memory stalls, launch gaps, and anomalous behavior in production.

eGPU bridges that gap by marrying the flexibility of eBPF with the parallel firepower of GPUs.


Core capabilities

| Capability | How it works | Benefit |
| --- | --- | --- |
| Dynamic PTX injection | At load time we JIT eBPF → PTX and patch it into the resident kernel | < 1 µs probe overhead on micro-benchmarks |
| Shared eBPF maps across CPU & GPU | boost::managed_shared_memory exposes the same map to host threads and device code | Zero-copy metrics exchange |
| User-space verifier & JIT (bpftime) | All safety checks stay in user space; no root privileges required | Fast iteration & lower attack surface |
| Hot-swap instrumentation | Add/remove probes while kernels keep running | Debug live services without downtime |
| CXL.mem latency modelling | Optional delay injection emulates tier-2 memory | Prototype far-memory systems on today's hardware |

Project highlights

  • Low overhead: < 5 % runtime impact on memory‑bound kernels up to 128 KB access size (see Fig. 2 of the paper).
  • Open ecosystem: Works with standard eBPF tooling—clang, bpftool, bpftrace.
  • Future‑proof: Design anticipates Grace‑Hopper architectures & CXL memory pools.
Citation

```bibtex
@inproceedings{yang2025bpftimesuper,
  title     = {eGPU: Extending eBPF Programmability and Observability to GPUs},
  author    = {Yang, Yiwei and Tong, Yu and Zheng, Yusheng and Quinn, Andrew},
  booktitle = {4th Workshop on Heterogeneous Composable and Disaggregated Systems},
  year      = {2025}
}
```
