@jeffbolznv
Collaborator

  • Move perf_logger from device to ctx.
  • Add an env var to control how often the stats are dumped. If you set a very large value, the stats are dumped only when the ctx is destroyed.
  • Add a fusion info string to the tracking, and log only one item per fused op.
  • Fix the MUL_MAT_ID flops calculation.

@jeffbolznv jeffbolznv requested a review from 0cc4m as a code owner December 2, 2025 01:50
@github-actions github-actions bot added Vulkan Issues specific to the Vulkan backend ggml changes relating to the ggml tensor library for machine learning labels Dec 2, 2025