A tracking utility for gathering the compile and/or runtime time, size, profiling and other statistics #4777
Conversation
},
py::call_guard<py::gil_scoped_release>());
Why removed?
It doesn't allow calling the callback function.
Is it possible to make it conditional? For example, still use it if pyCb == std::nullopt.
Now it's released at the beginning of the lambda and reacquired on each callback call.
I would also add tests for this utility so that the code does not become outdated unexpectedly.
@AndreyPavlenko Will it be possible to distinguish between configurations for a single script, like our microbenchmarks? For example, if I call a kernel with different input parameters, I would probably want a separate compile time for each input size.
There will be separate reports for each compilation.
Can you show how to distinguish between them? Currently I only see a folder with the kernel name and, inside it, a lot of files with very similar names.
Currently the report has the same name as the kernel, so it's difficult to distinguish them. A similar issue is discussed here: #4800 (comment).
Now constexprs are added to the kernel names and the grid is added to the kernel runs.
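For illustration only, a rough sketch of how constexpr values and the launch grid can be folded into a report name so that different compilations and runs of the same kernel stay distinguishable; the function and naming scheme below are hypothetical, not the exact ones used in this PR:

import os
from pathlib import Path

def report_path(dump_dir, kernel_name, constexprs, grid=None):
    # Build a file name from the kernel name plus its constexpr values,
    # optionally appending the launch grid for run reports.
    parts = [kernel_name] + [f"{k}-{v}" for k, v in sorted(constexprs.items())]
    if grid is not None:
        parts.append("grid-" + "x".join(str(g) for g in grid))
    return Path(dump_dir) / kernel_name / ("_".join(parts) + ".json")

# report_path("reports/track", "matmul_kernel", {"BLOCK_M": 128, "BLOCK_N": 64}, grid=(32, 16, 1))
# -> reports/track/matmul_kernel/matmul_kernel_BLOCK_M-128_BLOCK_N-64_grid-32x16x1.json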
VERIFY: ${{ (github.event_name == 'pull_request' || github.event_name == 'schedule' || inputs.verify) && '1' || '0' }}
TAG: ${{ inputs.tag || (github.event_name == 'pull_request' && format('pr-{0}', github.event.number)) || (github.event_name == 'schedule' && 'ci') || 'test' }}
N_RUNS: ${{ inputs.n_runs || '1' }}
TRITON_TRACK_DUMP: "$PWD/reports/track"
Let's make it optional, depending on user input. It can cause overhead, which can generally be avoided.
I would also enable this profiling for at least some tests in the intel folder.
I'll remove this line. It is not sufficient on its own, because the dumps are not picked up. That requires adding some additional logic to the workflows, and it's probably better to do that in a separate PR.
third_party/intel/backend/track.py (outdated)
def _tr_env(name: str, default: str = "", type: Any = str) -> Any:
    return type(os.environ.get(name, default).strip())
This returns a type, not a value.
It returns the value converted to the specified type: str, int, etc.
Oh, I see. This is why it is not good to shadow built-in functions. Let's give the type parameter another name.
Agree.
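A minimal sketch of that rename, assuming the parameter is called conv (one option among many, not necessarily what the PR ends up using):

import os
from typing import Any, Callable

def _tr_env(name: str, default: str = "", conv: Callable[[str], Any] = str) -> Any:
    # Read the environment variable and convert it with conv (str, int, float, ...),
    # so the caller gets a value of the requested type instead of the raw string.
    return conv(os.environ.get(name, default).strip())

# n_runs = _tr_env("TRITON_TRACK_N_RUNS", "1", int)  # hypothetical variable name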
# To enable the tracking, set the environment variable ``TRITON_TRACK_DUMP``
# to either ``1``, ``true``, ``yes``, ``on``, ``y`` or a path to a directory
# where the tracking reports will be dumped.
Do we really need all these possible values for TRITON_TRACK_DUMP? I would leave only the directory-path and undefined cases.
It can also print the dumps to the console. There are so many values in order to be consistent with other boolean env vars, which support all of these values.
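For illustration, a sketch of how those values could be interpreted, consistent with the description above; the helper and constant names are hypothetical, not taken from the PR:

import os

_TRUE_VALUES = {"1", "true", "yes", "on", "y"}

def _track_dump_target():
    # Returns (enabled, dump_dir): a truthy value means "print to the console",
    # any other non-empty value is treated as a directory to dump reports into.
    raw = os.environ.get("TRITON_TRACK_DUMP", "").strip()
    if not raw:
        return False, None
    if raw.lower() in _TRUE_VALUES:
        return True, None
    return True, raw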
third_party/intel/backend/track.py (outdated)
return decorator(funcOrName) if callable(funcOrName) else decorator

# This ugly hook is used to decorate the upstream functions and avoid circular imports.
Why do circular imports appear?
Because we decorate functions from the triton.runtime.jit module in the backend, but the backend itself is called by that module.
I don't see decorators in triton.runtime.jit, only in backend/compiler.py. Maybe changing the import will help: https://github.com/intel/intel-xpu-backend-for-triton/pull/4777/files#r2310330084.
I haven't touched the upstream code. The decorators are injected here. We can't do something like the following here:

from triton.runtime.jit import JITFunction
JITFunction._do_compile = decorate(JITFunction._do_compile)

because this code is called from triton.runtime.jit, and we would get a circular import.
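A sketch of the hook-based injection described above, with illustrative names rather than the exact ones used in the PR: instead of importing triton.runtime.jit at module load time, the backend exposes a hook that the runtime calls once it is fully imported, passing in the class whose methods should be wrapped.

import functools

def _track(func):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # record start time, call the original function, record end time, dump a report
        return func(*args, **kwargs)
    return wrapper

def install_tracking_hook(jit_function_cls):
    # Called by the runtime after triton.runtime.jit is fully imported,
    # so this module never has to import it and the cycle is avoided.
    jit_function_cls._do_compile = _track(jit_function_cls._do_compile)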
To enable the tracking, set the environment variable TRITON_TRACK_DUMP to either 1, true, yes, on, y, or a path to a directory where the tracking reports will be dumped. To add the profiling statistics to the reports, set the TRITON_TRACK_PROFILE environment variable. To track the kernel launches, set the TRITON_TRACK_RUN environment variable.

Link #4716
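For example, a driver script could enable the tracking like this (a sketch assuming the variables are read when triton compiles and launches the kernels, as the description above suggests; the paths and values are placeholders):

import os

os.environ["TRITON_TRACK_DUMP"] = "/tmp/track-reports"  # or "1" to print to the console
os.environ["TRITON_TRACK_PROFILE"] = "1"                 # add profiling statistics to the reports
os.environ["TRITON_TRACK_RUN"] = "1"                     # also track kernel launches

import triton  # imported after the variables are set
# ... define and launch kernels as usual; a report is produced per compilation/run.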