Skip to content

Conversation

@Forsworns
Copy link

Description

Type of change

  • New feature (non-breaking change which adds functionality)

How Has This Been Tested?

Tested via the provided demo.

@Forsworns
Copy link
Author

I just found there were some typos in the README and the binary was included.

@yunwei37 yunwei37 requested a review from Copilot October 27, 2025 12:25
@yunwei37
Copy link
Member

Thanks a lot!

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds early exit functionality for CUDA kernels through eBPF helpers, enabling kernel atomization capabilities similar to network packet filtering. It introduces two new BPF helper functions (bpf_cuda_exit and bpf_get_grid_dim) and provides a complete demonstration through a vector addition example with partition-based execution control.

Key changes:

  • Two new BPF helper functions (507: exit, 508: get_grid_dim) for CUDA kernel control
  • Complete atomizer example with partition-based block filtering
  • PTX-level early exit implementation via inline assembly

Reviewed Changes

Copilot reviewed 11 out of 12 changed files in this pull request and generated 6 comments.

Show a summary per file
File Description
attach/nv_attach_impl/trampoline/default_trampoline.cu Implements the two new BPF helper functions in CUDA
attach/nv_attach_impl/trampoline_ptx.h Adds PTX assembly for the new helper functions
attach/nv_attach_impl/nv_attach_impl_patcher.cpp Registers the new helper functions (507 and 508)
example/gpu/atomizer/atomizer.bpf.c eBPF program implementing partition-based kernel atomization
example/gpu/atomizer/atomizer.c Userspace loader for the eBPF atomizer program
example/gpu/atomizer/vec_add.cu CUDA vector addition demo application
example/gpu/atomizer/main.ptx Generated PTX assembly from the demo
example/gpu/atomizer/filter_hashtag.py Utility script to filter preprocessor directives
example/gpu/atomizer/README.md Documentation for the atomizer example
example/gpu/atomizer/Makefile Build configuration for the atomizer example
example/gpu/atomizer/.gitignore Git ignore rules for build artifacts

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

- Add two bpf helper functions for CUDA.
- Add an early-exit demo in CUDA examples.

Close eunomia-bpf#459

Signed-off-by: Forsworns <yangpeihao@sjtu.edu.cn>
@Forsworns
Copy link
Author

I have addressed typos. But I found another problem:

when I set the launching configuration in vector_add.cu to vectorAdd<<<10, 1>>>(d_A, d_B, d_C);,
and try to read the pre-configured partition number/index from the BPF maps in the atomizer.bpf.c, it sometimes returns null pointer for the given key. I'm not sure where the problem is.

Currently, I only launch a single block in the vector_add.c and it works well. But then only part of threads are exited, instead of the whole thread blocks. Thus, the semantic is different from the LithOS. :(

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[FEATURE] CUDA kernel early exit demo

3 participants