WIP: Tracing subsystem #703

Draft
vchuravy wants to merge 2 commits into main from vc/tracing

vchuravy (Member) commented Jun 8, 2026
No description provided.

github-actions Bot (Contributor) commented Jun 8, 2026

Benchmark Results

| Benchmark | main | a123cc5... | main / a123cc5... |
|---|---|---|---|
| saxpy/default/Float32/1024 | 0.0868 ± 0.01 ms | 0.0859 ± 0.011 ms | 1.01 ± 0.17 |
| saxpy/default/Float32/1048576 | 0.488 ± 0.026 ms | 0.478 ± 0.026 ms | 1.02 ± 0.078 |
| saxpy/default/Float32/16384 | 0.0704 ± 0.034 ms | 0.0668 ± 0.033 ms | 1.05 ± 0.73 |
| saxpy/default/Float32/2048 | 0.0851 ± 0.026 ms | 0.0849 ± 0.027 ms | 1 ± 0.45 |
| saxpy/default/Float32/256 | 0.0868 ± 0.013 ms | 0.0857 ± 0.016 ms | 1.01 ± 0.25 |
| saxpy/default/Float32/262144 | 0.176 ± 0.033 ms | 0.168 ± 0.032 ms | 1.05 ± 0.28 |
| saxpy/default/Float32/32768 | 0.0752 ± 0.031 ms | 0.0725 ± 0.03 ms | 1.04 ± 0.61 |
| saxpy/default/Float32/4096 | 0.0793 ± 0.031 ms | 0.0845 ± 0.031 ms | 0.938 ± 0.5 |
| saxpy/default/Float32/512 | 0.0864 ± 0.011 ms | 0.086 ± 0.0099 ms | 1 ± 0.17 |
| saxpy/default/Float32/64 | 0.0872 ± 0.011 ms | 0.0859 ± 0.014 ms | 1.01 ± 0.21 |
| saxpy/default/Float32/65536 | 0.0933 ± 0.031 ms | 0.0878 ± 0.031 ms | 1.06 ± 0.51 |
| saxpy/default/Float64/1024 | 0.0866 ± 0.013 ms | 0.085 ± 0.02 ms | 1.02 ± 0.28 |
| saxpy/default/Float64/1048576 | 0.602 ± 0.098 ms | 0.585 ± 0.1 ms | 1.03 ± 0.24 |
| saxpy/default/Float64/16384 | 0.0686 ± 0.03 ms | 0.0673 ± 0.03 ms | 1.02 ± 0.64 |
| saxpy/default/Float64/2048 | 0.0857 ± 0.029 ms | 0.0846 ± 0.029 ms | 1.01 ± 0.49 |
| saxpy/default/Float64/256 | 0.0864 ± 0.011 ms | 0.0854 ± 0.012 ms | 1.01 ± 0.19 |
| saxpy/default/Float64/262144 | 0.2 ± 0.04 ms | 0.19 ± 0.04 ms | 1.05 ± 0.3 |
| saxpy/default/Float64/32768 | 0.0796 ± 0.03 ms | 0.0768 ± 0.03 ms | 1.04 ± 0.56 |
| saxpy/default/Float64/4096 | 0.0817 ± 0.029 ms | 0.0788 ± 0.03 ms | 1.04 ± 0.55 |
| saxpy/default/Float64/512 | 0.086 ± 0.012 ms | 0.085 ± 0.012 ms | 1.01 ± 0.2 |
| saxpy/default/Float64/64 | 0.0869 ± 0.013 ms | 0.0855 ± 0.017 ms | 1.02 ± 0.26 |
| saxpy/default/Float64/65536 | 0.102 ± 0.031 ms | 0.0952 ± 0.03 ms | 1.08 ± 0.47 |
| saxpy/static workgroup=(1024,)/Float32/1024 | 0.084 ± 0.013 ms | 0.0825 ± 0.012 ms | 1.02 ± 0.21 |
| saxpy/static workgroup=(1024,)/Float32/1048576 | 0.481 ± 0.025 ms | 0.467 ± 0.022 ms | 1.03 ± 0.073 |
| saxpy/static workgroup=(1024,)/Float32/16384 | 0.0661 ± 0.032 ms | 0.0635 ± 0.032 ms | 1.04 ± 0.72 |
| saxpy/static workgroup=(1024,)/Float32/2048 | 0.0828 ± 0.027 ms | 0.0821 ± 0.028 ms | 1.01 ± 0.47 |
| saxpy/static workgroup=(1024,)/Float32/256 | 0.0846 ± 0.016 ms | 0.0833 ± 0.014 ms | 1.02 ± 0.26 |
| saxpy/static workgroup=(1024,)/Float32/262144 | 0.172 ± 0.031 ms | 0.162 ± 0.028 ms | 1.06 ± 0.27 |
| saxpy/static workgroup=(1024,)/Float32/32768 | 0.072 ± 0.029 ms | 0.0693 ± 0.029 ms | 1.04 ± 0.61 |
| saxpy/static workgroup=(1024,)/Float32/4096 | 0.0804 ± 0.032 ms | 0.0815 ± 0.032 ms | 0.987 ± 0.55 |
| saxpy/static workgroup=(1024,)/Float32/512 | 0.0841 ± 0.013 ms | 0.0829 ± 0.011 ms | 1.01 ± 0.2 |
| saxpy/static workgroup=(1024,)/Float32/64 | 0.0844 ± 0.015 ms | 0.0831 ± 0.017 ms | 1.02 ± 0.27 |
| saxpy/static workgroup=(1024,)/Float32/65536 | 0.0882 ± 0.031 ms | 0.0841 ± 0.031 ms | 1.05 ± 0.53 |
| saxpy/static workgroup=(1024,)/Float64/1024 | 0.0833 ± 0.022 ms | 0.0827 ± 0.023 ms | 1.01 ± 0.38 |
| saxpy/static workgroup=(1024,)/Float64/1048576 | 0.602 ± 0.084 ms | 0.569 ± 0.085 ms | 1.06 ± 0.22 |
| saxpy/static workgroup=(1024,)/Float64/16384 | 0.0661 ± 0.029 ms | 0.0638 ± 0.028 ms | 1.04 ± 0.64 |
| saxpy/static workgroup=(1024,)/Float64/2048 | 0.082 ± 0.03 ms | 0.0821 ± 0.03 ms | 0.998 ± 0.51 |
| saxpy/static workgroup=(1024,)/Float64/256 | 0.0845 ± 0.014 ms | 0.083 ± 0.016 ms | 1.02 ± 0.26 |
| saxpy/static workgroup=(1024,)/Float64/262144 | 0.2 ± 0.038 ms | 0.187 ± 0.037 ms | 1.07 ± 0.3 |
| saxpy/static workgroup=(1024,)/Float64/32768 | 0.0768 ± 0.028 ms | 0.0741 ± 0.027 ms | 1.04 ± 0.54 |
| saxpy/static workgroup=(1024,)/Float64/4096 | 0.0774 ± 0.03 ms | 0.0764 ± 0.03 ms | 1.01 ± 0.56 |
| saxpy/static workgroup=(1024,)/Float64/512 | 0.0843 ± 0.012 ms | 0.0829 ± 0.012 ms | 1.02 ± 0.21 |
| saxpy/static workgroup=(1024,)/Float64/64 | 0.0842 ± 0.013 ms | 0.0831 ± 0.017 ms | 1.01 ± 0.26 |
| saxpy/static workgroup=(1024,)/Float64/65536 | 0.0993 ± 0.031 ms | 0.0922 ± 0.028 ms | 1.08 ± 0.47 |
| time_to_load | 1.04 ± 0.0071 s | 0.996 ± 0.012 s | 1.04 ± 0.014 |
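
For readers checking the ratio column: the reported spreads look consistent with first-order error propagation for a quotient of two independent measurements, although the thread does not say how the benchmark tooling actually computes them. The sketch below is a minimal illustration under that assumption; `ratio_with_uncertainty` is a hypothetical helper, not part of any benchmarking package, and it is applied to the `time_to_load` row.

```python
import math

def ratio_with_uncertainty(a, sigma_a, b, sigma_b):
    """Return a/b and its spread, assuming independent errors (an assumption,
    not something the PR thread confirms about the benchmark tooling)."""
    ratio = a / b
    # Relative uncertainties add in quadrature for a quotient.
    rel = math.sqrt((sigma_a / a) ** 2 + (sigma_b / b) ** 2)
    return ratio, ratio * rel

# time_to_load row: main = 1.04 ± 0.0071 s, a123cc5... = 0.996 ± 0.012 s
r, s = ratio_with_uncertainty(1.04, 0.0071, 0.996, 0.012)
print(f"{r:.3g} ± {s:.2g}")  # prints "1.04 ± 0.014"
```

Read this way, most saxpy rows have a spread much larger than their distance from 1, so they do not show a clear difference between the two revisions; `time_to_load` is the one row whose ratio sits more than one spread away from 1.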

Benchmark Plots

A plot of the benchmark results has been uploaded as an artifact to the workflow run for this PR.
Go to "Actions"->"Benchmark a pull request"->[the most recent run]->"Artifacts" (at the bottom).
