Add `TRITON_IGNORE_LIBTRITON_HASH` for hash stability between Python versions #6617
When Triton changes, we should definitely invalidate the cache. I don't understand the description: this is not about updating Python. If you use a different Triton build, you can't expect the cache to be compatible.
Yes, we should definitely invalidate the cache when Triton's functionality changes. By 'functionality' I mostly mean how we compile the Triton language -> TTIR -> PTX -> cubin. Conceptually, this functionality should be the same in all Python versions (3.9 to 3.13), which is why I say the compiled cubin should not depend on the Python version. However, in the current implementation, the cache hash depends on the binary content of libtriton, which includes not only the above functionality of Triton but also the C-Python binding (and other things like debug information). When I change the Python version, the cache hash still changes even if the compiled cubin is exactly the same. That's the problem. In this PR I provide an opt-in way to ignore this dependency. Maybe there is a better way to define the 'functionality' of Triton in the cache hash, such as the Triton package version or the git hash. What do you think?
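To make the point above concrete, here is a minimal sketch of a cache key that mixes in the bytes of a compiled extension unless an opt-in variable is set. It is not Triton's actual implementation: the function `cache_key`, its parameters, and the way the key is assembled are hypothetical, and only the variable name `TRITON_IGNORE_LIBTRITON_HASH` comes from this PR.

```python
import hashlib
import os
from pathlib import Path


def cache_key(source: str, lib_path: Path, package_version: str) -> str:
    """Hypothetical cache key, for illustration only (not Triton's real code)."""
    h = hashlib.sha256()
    # Inputs that describe what gets compiled, independent of the Python version.
    h.update(source.encode())
    h.update(b"\0")
    h.update(package_version.encode())
    h.update(b"\0")
    # Opt-in: skip the binary contents of the extension, which also contain
    # the Python binding and possibly debug information.
    if os.environ.get("TRITON_IGNORE_LIBTRITON_HASH", "0") != "1":
        h.update(lib_path.read_bytes())
    return h.hexdigest()
```

With the variable unset, two builds of the library that differ only in their binding code produce different keys. With it set to `1`, the key depends only on the source and the package version, so anything the version string does not capture has to be handled by clearing the cache manually.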
I don't follow how changing your Python version changes libtriton.so. Are you saying that you rebuild libtriton with different Python headers?
But this also ignores a lot of real dependencies. I don't see how this can ever be correct.
I don't think we want the build to depend on git. I do think that any rebuild of the Triton backend should invalidate the cache, and there is no way to know what could have changed in the binary. Are you rebuilding Triton with different Python headers that often?
The problem is not that I rebuild Triton. Usually (when I release my Windows wheels) I build the same version of Triton for all Python versions (3.9 to 3.13), and users may switch between them.
58a9579 to 3746fe8
Now I get the git hash in the package version using The package version is like We can assume that the git hash changes whenever the functionality of libtriton changes, and it does not change when the Python version changes. Also, I rebased because of #6467. What do you think?
It still feels like a footgun to me, as any changes not committed would not be considered by the caching, right?
@ThomasRaoux what do you think about allowing
Indeed, currently the git hash in
The idea seems fine to me, but the description of the PR has to be updated.
@danzimm I agree that a hook/config to directly provide
This is for reducing the disk usage of the JIT-compiled CUDA binaries (such as `add_kernel.cubin`). For example, when running a large AI model such as Wan with torch.compile autotune, the cached binaries can take 500 MB of disk space, and I would like to avoid recompiling them when switching the Python version.

Conceptually, the CUDA binaries generated by Triton should not depend on the Python version (they don't involve any C-Python binding), and I've tested that their binary contents are actually the same between Python versions. However, in the current implementation, the cache folder hash depends on libtriton, which involves the C-Python binding. Also, the binary content of libtriton can depend on debug information such as the tmp path where it was compiled.
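The claim that the generated binaries are identical across Python versions can be checked with a small script along these lines. This is only a sketch under assumptions: the helper names `digest_by_name` and `differing_files` are made up for illustration, the two directories stand for cache folders produced under two Python versions, and file names are assumed to be unique within each directory.

```python
import hashlib
from pathlib import Path


def digest_by_name(cache_dir: Path, pattern: str = "*.cubin") -> dict[str, str]:
    """Map each matching file name under cache_dir to the SHA-256 of its bytes."""
    return {
        p.name: hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(cache_dir.rglob(pattern))
    }


def differing_files(dir_a: Path, dir_b: Path) -> list[str]:
    """Names of files that are missing on one side or whose contents differ."""
    a, b = digest_by_name(dir_a), digest_by_name(dir_b)
    return sorted(name for name in a.keys() | b.keys() if a.get(name) != b.get(name))
```

An empty list from `differing_files` means every matching file exists in both directories with byte-identical contents, even if the hash-named subfolders holding them differ.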
I propose adding the environment variable `TRITON_IGNORE_LIBTRITON_HASH`, which lets users opt in to ignoring this dependency. The cache can then be reused between Python versions, and the user is responsible for clearing the cache if they somehow update the functionality of libtriton without changing the rest of the Python code in the package.

New contributor declaration
I am not making a trivial change, such as fixing a typo in a comment.
I have written a PR description following these rules.
I have run `pre-commit run --from-ref origin/main --to-ref HEAD`.
Select one of the following.
`/test` for `lit` tests
`/unittest` for C++ tests
`/python/test` for end-to-end tests
Select one of the following.
`lit` tests.
`lit` tests I have added follow these best practices, including the "tests should be minimal" section. (Usually running Python code and using the instructions it generates is not minimal.)