vchuravy
Member

The idea is to have a common output format for the CPU backend as well as the GPU backend.
This implements NVTXT output so that you can load these annotations with nsys.

export KERNELABSTRACTIONS_TIMELINE=true

julia> import KernelAbstractions.Extras.Timeline.NVTXT
julia> NVTXT.push_range("Costly computation")
julia> NVTXT.pop_range()
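
Since every `push_range` needs a matching `pop_range`, one way to keep them balanced is a small do-block wrapper. This is only a sketch: `with_range` is a hypothetical helper that is not part of this PR, and `fake_push`/`fake_pop` are stand-ins so the snippet runs without this branch installed.

```julia
# Hypothetical helper (not part of KernelAbstractions): run `f` inside a range
# and always pop the range, even if `f` throws.
function with_range(f, push_range, pop_range, message)
    push_range(message)
    try
        return f()
    finally
        pop_range()
    end
end

# Stand-in recorders that only log the calls they receive.
events = String[]
fake_push(msg) = push!(events, "push: $msg")
fake_pop() = push!(events, "pop")

# The do-block becomes the first argument `f`.
result = with_range(fake_push, fake_pop, "Costly computation") do
    sum(abs2, 1:1000)
end
```

With this branch loaded, one would presumably pass `NVTXT.push_range` and `NVTXT.pop_range` in place of the stand-ins.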

This leaves a file in the current working directory:

cat ka-31089.nvtxt
SetFileDisplayName, KernelAbstractions
@RangeStartEnd, Start, End, ThreadId, Message
ProcessId = 31089
CategoryId = 1
Color = Blue
TimeBase = Manual
@RangePush, Time, ThreadId, Message
ProcessId = 31089
CategoryId = 1
Color = Blue
TimeBase = Manual
@RangePop, Time, ThreadId
ProcessId = 31089
TimeBase = Manual
@Marker, Time, ThreadId, Message
ProcessId = 31089
CategoryId = 1
Color = Blue
TimeBase = Manual
RangePush, 515251367655304, 1, "Costly computation"
RangePop, 515256007484357, 1
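
As a quick sanity check, the two data lines at the end of the file can be paired up to get the length of the range. The sketch below assumes the `Time` column is in nanoseconds and that a `RangePop` closes the most recent `RangePush` on the same thread; `range_durations` is a hypothetical helper, not something this PR provides.

```julia
# The two data lines from ka-31089.nvtxt, copied from the listing above.
lines = [
    "RangePush, 515251367655304, 1, \"Costly computation\"",
    "RangePop, 515256007484357, 1",
]

# Hypothetical helper: match each RangePop with the latest unmatched
# RangePush on the same thread and return (message, duration) pairs.
function range_durations(lines)
    # thread id => stack of (start time, message)
    open_ranges = Dict{Int64,Vector{Tuple{Int64,String}}}()
    durations = Tuple{String,Int64}[]
    for line in lines
        fields = strip.(split(line, ","; limit = 4))
        kind = fields[1]
        if kind == "RangePush"
            t = parse(Int64, fields[2])
            tid = parse(Int64, fields[3])
            msg = String(strip(fields[4], '"'))
            stack = get!(open_ranges, tid, Tuple{Int64,String}[])
            push!(stack, (t, msg))
        elseif kind == "RangePop"
            t = parse(Int64, fields[2])
            tid = parse(Int64, fields[3])
            start, msg = pop!(open_ranges[tid])
            push!(durations, (msg, t - start))
        end
    end
    return durations
end

durations = range_durations(lines)
```

For the file above this yields a single entry for "Costly computation" with a duration of 4639829053, which is roughly 4.64 s if the timestamps are indeed nanoseconds.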

Convert to qdrep and inspect the result:

LD_LIBRARY_PATH=/opt/nsight-systems-2020.1.1/host-linux-x64/ /opt/nsight-systems-2020.1.1/host-linux-x64/ImportNvtxt --cmd create --nvtxt ka-31089.nvtxt -o report.qdrep

LD_LIBRARY_PATH=/opt/nsight-systems-2020.1.1/host-linux-x64/ /opt/nsight-systems-2020.1.1/host-linux-x64/ImportNvtxt --cmd info -i report.qdrep 
Analysis start (ns)     515251367000000
Analysis end (ns)       515256008000000

@vchuravy
Member Author

@lcw is that roughly what you were thinking? I am unsure whether CPU code should have ranges auto-injected or whether I should leave that up to the user.
