Add NVTX range in CUDA GPU kernel call of program #1986

philip-paul-mueller · 2025-04-28T05:13:31Z

These changes were made by Ioannis Magkanaris; I only opened the PR.
It adds NVTX ranges around the kernel call generated by DaCe, which makes it easy to distinguish that call from other CUDA activity such as CuPy.

alexnick83

That's a great idea, and NVTX ranges can also be profiled with nsys. However, would it perhaps be more appropriate to do it through the Instrumentation API? I believe we already have everything in place with GPUEventProvider. The push should happen in the on_scope_entry method and the pop in the on_scope_exit method.

@tbennun @phschaad since you have worked on that file the most, do you have a better suggestion?

tbennun · 2025-04-28T21:31:38Z

I agree. This should be possible to implement in a nicer way with the instrumentation API, instrumenting the SDFG (or a state, or a group of maps, etc.) with, e.g., a new instrumentation type called GPU_Region. If it is part of the instrumentation, it will also not be enabled by default all the time (the calls might add overhead for very short, microsecond-scale SDFGs).

In fact, this could even be implemented in Python with the SDFG call hooks. Here is an example of how to do it in CuPy, which should be even more portable towards AMD GPUs:
https://docs.cupy.dev/en/latest/reference/generated/cupy.cuda.nvtx.RangePush.html
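A minimal sketch of the Python-side idea, assuming only the `cupy.cuda.nvtx.RangePush` / `RangePop` functions from the link above. The helper names `nvtx_range` and `run_with_range` are hypothetical, and this is a plain wrapper around a callable, not the actual DaCe call-hook API. If CuPy cannot be imported, the ranges fall back to no-ops.

```python
from contextlib import contextmanager

try:
    # CuPy exposes NVTX push/pop as cupy.cuda.nvtx.RangePush / RangePop.
    from cupy.cuda import nvtx as _nvtx
except Exception:
    _nvtx = None  # CuPy not usable here: ranges become no-ops


@contextmanager
def nvtx_range(name):
    """Wrap a block of code in a named NVTX range (hypothetical helper)."""
    if _nvtx is not None:
        _nvtx.RangePush(name)
    try:
        yield
    finally:
        # Pop even if the wrapped code raises, so ranges stay balanced.
        if _nvtx is not None:
            _nvtx.RangePop()


def run_with_range(program, *args, **kwargs):
    """Call `program` (e.g., a compiled SDFG) inside an NVTX range."""
    name = getattr(program, "__name__", "sdfg_call")
    with nvtx_range(name):
        return program(*args, **kwargs)
```

With something like this, `run_with_range(my_program, a, b)` would show up as one named range in an nsys timeline, provided the profiler is attached and CuPy is installed.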

philip-paul-mueller · 2025-04-29T05:05:10Z

Okay, we will look into the GPUEventProvider direction.
However, I am against using a hook in Python, as this will for sure add way more overhead to "[...] very short microsecond-scale SDFGs [...]".
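One rough way to check this concern is to time a trivial call with and without a Python-level wrapper. The sketch below is an assumption-laden micro-benchmark: `kernel_stub` stands in for a very short SDFG call and `noop_range` for an NVTX push/pop pair, so it only measures the Python context-manager cost, not NVTX or GPU time.

```python
import timeit
from contextlib import contextmanager


@contextmanager
def noop_range(name):
    # Stand-in for an NVTX push/pop pair; does no GPU work.
    yield


def kernel_stub():
    # Placeholder for a very short SDFG call.
    return None


def wrapped():
    # Same call, but routed through the Python-level wrapper.
    with noop_range("sdfg_call"):
        kernel_stub()


def per_call_seconds(fn, number=100_000):
    """Average wall-clock time of one call to `fn`, in seconds."""
    return timeit.timeit(fn, number=number) / number


bare = per_call_seconds(kernel_stub)
hooked = per_call_seconds(wrapped)
print(f"bare: {bare * 1e6:.3f} us, wrapped: {hooked * 1e6:.3f} us")
```

The absolute numbers depend on the machine and the Python version; the point is only to compare the wrapper's cost against the runtime of the SDFG being profiled.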

Add NVTX range in CUDA GPU kernel call of program

6f7b8c4

alexnick83 requested changes Apr 28, 2025
