
Conversation

@AndreyPavlenko (Contributor) commented Jul 25, 2025

To enable tracking, set the environment variable TRITON_TRACK_DUMP to 1, true, yes, on, or y, or to a path to a directory where the tracking reports will be dumped.
To add profiling statistics to the reports, set the TRITON_TRACK_PROFILE environment variable.
To track kernel launches, set the TRITON_TRACK_RUN environment variable.
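
For example, a minimal sketch of enabling all three from a script (the directory path is illustrative, and the variables must be set before Triton reads them):

import os
os.environ["TRITON_TRACK_DUMP"] = "/tmp/triton-track"  # or "1" to dump to the console
os.environ["TRITON_TRACK_PROFILE"] = "1"  # add profiling statistics to the reports
os.environ["TRITON_TRACK_RUN"] = "1"  # also track kernel launches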

Link #4716

@AndreyPavlenko AndreyPavlenko force-pushed the AndreyPavlenko/track branch 3 times, most recently from 41015d0 to 1216480 Compare July 25, 2025 20:48
@AndreyPavlenko AndreyPavlenko changed the title Implemented compile time/size tracking and profiling utility A tracking utility for gathering the compile and/or runtime time, size, profiling and other statistics Jul 25, 2025
@AndreyPavlenko AndreyPavlenko force-pushed the AndreyPavlenko/track branch 2 times, most recently from 7843958 to 9752167 Compare July 29, 2025 13:44
@AndreyPavlenko AndreyPavlenko marked this pull request as ready for review July 29, 2025 18:26
Comment on lines -268 to -269
},
py::call_guard<py::gil_scoped_release>());
Contributor:
Why removed?

Contributor Author:
It doesn't allow calling the callback function.

Contributor:
Is it possible to make it conditional? For example, still use it if pyCb == std::nullopt.

Contributor Author:
Now the GIL is released at the beginning of the lambda and re-acquired on each callback call.

@anmyachev (Contributor):
I would also add tests for this utility so that the code does not become outdated unexpectedly.

@Egor-Krivov (Contributor) commented Aug 11, 2025

@AndreyPavlenko Will it be possible to distinguish between configurations for the single script like our microbenchmarks? Like if I call kernel with different input parameters, I would probably want to get separate compile time for each input size.

@AndreyPavlenko (Contributor Author):

> @AndreyPavlenko Will it be possible to distinguish between configurations for the single script like our microbenchmarks? Like if I call kernel with different input parameters, I would probably want to get separate compile time for each input size.

There will be separate reports for each compilation.

@Egor-Krivov (Contributor):

> There will be separate reports for each compilation.

Can you show how to distinguish between them? I think currently I only see a folder with the kernel name, and inside it a lot of files with similar names, like kernel.run_3842.json. Can I somehow extract which run corresponds to which shape? Maybe I could somehow affect the naming, like calling some sort of `profiling.label("m32_n32_k32")`? Or store all results in one large JSON based on my provided labels?

@AndreyPavlenko (Contributor Author):

> Can you show how to distinguish between them? I think currently I only see a folder with the kernel name, and inside it a lot of files with similar names, like kernel.run_3842.json.

Currently the report has the same name as the kernel, so it's difficult to distinguish them. A similar issue is discussed here - #4800 (comment) .

kernel.run_3842.json is related to kernel-run tracking, not compilation, and you probably don't need it. Just don't set the TRITON_TRACK_RUN env var.

@AndreyPavlenko (Contributor Author):
Now the constexprs are added to the kernel names, and the grid is added to the kernel-run reports.

@vlad-penkin vlad-penkin linked an issue Aug 25, 2025 that may be closed by this pull request
@AndreyPavlenko AndreyPavlenko force-pushed the AndreyPavlenko/track branch 3 times, most recently from f16b622 to c465ae3 Compare August 29, 2025 13:43
VERIFY: ${{ (github.event_name == 'pull_request' || github.event_name == 'schedule' || inputs.verify) && '1' || '0' }}
TAG: ${{ inputs.tag || (github.event_name == 'pull_request' && format('pr-{0}', github.event.number)) || (github.event_name == 'schedule' && 'ci') || 'test' }}
N_RUNS: ${{ inputs.n_runs || '1' }}
TRITON_TRACK_DUMP: "$PWD/reports/track"
Contributor:
Let's make it optional, depending on user input. It can cause overhead, which can generally be avoided.

Contributor:
I would also enable this profiling at least for some tests in the intel folder.

Contributor Author:
I'll remove this line. It's not sufficient on its own, because the dumps are not picked up; that requires adding some additional logic to the workflows, and it's probably better done in a separate PR.



def _tr_env(name: str, default: str = "", type: Any = str) -> Any:
return type(os.environ.get(name, default).strip())
Contributor:
This returns a type, not a value.

Contributor Author:
It returns a value of the specified type - str, int, etc.

Contributor:
Oh, I see. This is why it's not good to shadow built-in functions. Let's give the type variable another name.

Contributor Author:
Agree.
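
For illustration, the renamed helper might look like this (a sketch only; the final parameter name in the PR may differ):

import os
from typing import Any, Callable

def _tr_env(name: str, default: str = "", conv: Callable[[str], Any] = str) -> Any:
    # Read the env var and convert its value with `conv` (e.g. str or int).
    return conv(os.environ.get(name, default).strip())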

Comment on lines +6 to +18
# To enable the tracking, set the environment variable ``TRITON_TRACK_DUMP``
# to either ``1``, ``true``, ``yes``, ``on``, ``y`` or a path to a directory
# where the tracking reports will be dumped.
Contributor:
Do we really need all these possible values for TRITON_TRACK_DUMP?

I would keep only the directory-path and unset cases.

Contributor Author:
It can also print the dumps to the console. There are so many values in order to stay consistent with other boolean env vars, which support all of these values.
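
A sketch of the parsing this implies (the function name is hypothetical, not the PR's actual code):

def _track_dump_target(value: str):
    # Truthy values mean "dump to the console"; anything else is a directory path.
    if value.strip().lower() in ("1", "true", "yes", "on", "y"):
        return None  # console
    return value  # directory path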


return decorator(funcOrName) if callable(funcOrName) else decorator

# This ugly hook is used to decorate the upstream functions and avoid circular imports.
Contributor:
Why do circular imports appear?

Contributor Author:
Because we decorate functions from the triton.runtime.jit module in the backend, but the backend is called by that module.

Contributor:
I don't see decorators in triton.runtime.jit, only in backend/compiler.py. Maybe changing the import will help: https://github.com/intel/intel-xpu-backend-for-triton/pull/4777/files#r2310330084.

Contributor Author:
I haven't touched the upstream code; the decorators are injected here. We can't do something like:

from triton.runtime.jit import JITFunction
JITFunction._do_compile = decorate(JITFunction._do_compile)

because this code is called from triton.runtime.jit, and we'd get a circular import.
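
A minimal sketch of such a late-binding hook, assuming a registry that triton.runtime.jit applies once it has finished importing (all names are illustrative, not the PR's actual API):

_pending_decorators = []

def register_jit_decorator(decorator):
    # Called from the backend at import time; no import of triton.runtime.jit needed.
    _pending_decorators.append(decorator)

def apply_jit_decorators(jit_function_cls):
    # Called later, from triton.runtime.jit itself, once the module fully exists.
    for decorator in _pending_decorators:
        jit_function_cls._do_compile = decorator(jit_function_cls._do_compile)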

@AndreyPavlenko AndreyPavlenko force-pushed the AndreyPavlenko/track branch 2 times, most recently from f0c3086 to 9f72bda Compare September 1, 2025 18:59
Development

Successfully merging this pull request may close these issues.

Compile Time Tracking for Key Workloads
3 participants