
Conversation

@gante (Contributor) commented Apr 18, 2025

What does this PR do?

kernels and torch.compile are not yet compatible with each other. Although we can skip custom kernels when the package is not installed, surfacing an error message is also not feasible -- we can't throw exceptions at compile time.

This PR hijacks the kernels decorator to add a compile-friendly path: until kernels supports torch.compile, let's use the original forward.
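
For context, a rough sketch of the kind of hijack described above. It is illustrative only, not the code added in this PR: the decorator name and the calling convention of kernel_forward are made up, and is_torchdynamo_compiling stands in for the transformers helper of the same name.

import torch

def is_torchdynamo_compiling():
    # simplified stand-in for the transformers utility
    return torch.compiler.is_compiling()

def use_kernel_with_compile_fallback(kernel_forward):
    # class decorator: dispatch to the custom kernel in eager mode,
    # and to the class's original forward while torch.compile is tracing
    def decorator(cls):
        original_forward = cls.forward

        def forward(self, *args, **kwargs):
            if is_torchdynamo_compiling():
                return original_forward(self, *args, **kwargs)
            return kernel_forward(self, *args, **kwargs)

        cls.forward = forward
        return cls

    return decorator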

@github-actions github-actions bot marked this pull request as draft April 18, 2025 09:18
@github-actions (bot):

Hi 👋, thank you for opening this pull request! The pull request is converted to draft by default. The CI will be paused while the PR is in draft mode. When it is ready for review, please click the Ready for review button (at the bottom of the PR page). This will assign reviewers and trigger CI.

@gante gante marked this pull request as ready for review April 18, 2025 09:20
@gante (Contributor, Author) commented Apr 18, 2025

@qubvel I'm not sure how to implement the manual escape path (as discussed on Slack and here) 🤔

From the decorator's perspective, we only have access to cls. Some classes where the kernel is applied don't offer any way to set extra flags at model definition time (e.g.), so we can't do something like model.disable_custom_kernels = False with the decorator approach -- not unless we add a lot of extra code.

Using DISABLE_KERNEL_MAPPING=1 should work, though.

Any suggestions or ideas?

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@gante gante requested review from Cyrilvallez and qubvel April 18, 2025 09:55
Comment on lines 80 to 84
def forward_with_compile_path(*forward_args, **forward_kwargs):
    if is_torchdynamo_compiling():
        return original_forward(*forward_args, **forward_kwargs)
    else:
        return kernel_forward(*forward_args, **forward_kwargs)

@qubvel (Contributor):

Maybe not super clean, but we can have an extra kwarg, something like that:

Suggested change:

def forward_with_compile_path(*forward_args, **forward_kwargs):
    use_kernel = forward_kwargs.pop("use_kernel", True)
    if is_torchdynamo_compiling() or not use_kernel:
        return original_forward(*forward_args, **forward_kwargs)
    else:
        return kernel_forward(*forward_args, **forward_kwargs)
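
If something along those lines landed, the opt-out would look like this at the call site (hypothetical usage, and only for layers whose forward accepts extra kwargs):

hidden_states = layer(hidden_states, use_kernel=False)  # force the original forward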

@qubvel (Contributor) commented Apr 18, 2025:

Then I can manage it on the module call.

@gante (Contributor, Author):

Yeah, good idea, that could work!

@gante (Contributor, Author) commented Apr 18, 2025:

@qubvel it hits a similar barrier: many layers don't receive **kwargs in forward, so arbitrary kwargs passed as in model(..., use_kernel=False) will never reach them :(

example:

class Phi3RMSNorm(nn.Module):
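
For reference, a simplified copy of that layer (trimmed from the actual modeling file): forward takes only hidden_states, so an extra use_kernel kwarg has nowhere to go.

import torch
from torch import nn

class Phi3RMSNorm(nn.Module):
    def __init__(self, hidden_size, eps=1e-6):
        super().__init__()
        self.weight = nn.Parameter(torch.ones(hidden_size))
        self.variance_epsilon = eps

    def forward(self, hidden_states):  # no **kwargs to pop a flag from
        input_dtype = hidden_states.dtype
        hidden_states = hidden_states.to(torch.float32)
        variance = hidden_states.pow(2).mean(-1, keepdim=True)
        hidden_states = hidden_states * torch.rsqrt(variance + self.variance_epsilon)
        return self.weight * hidden_states.to(input_dtype)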

@Cyrilvallez (Member) commented Apr 18, 2025:

Custom backends (https://pytorch.org/docs/stable/torch.compiler_custom_backends.html) could be a way to raise exceptions at compile time, but probably a huge struggle (as we allow users to actually choose the backend). Just posting here in case, but I don't really believe in it.
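
For what it's worth, a minimal sketch of that idea (an assumption, not something attempted in this PR; the kernels detection and the DISABLE_KERNEL_MAPPING check are hypothetical wiring). A custom backend runs once per traced graph, so raising inside it surfaces at compile time; the catch, as noted, is that it only helps if users don't pick their own backend.

import os

import torch
from torch import nn

kernels_installed = True  # assumption: in practice, detect whether the kernels package is importable

def strict_backend(gm: torch.fx.GraphModule, example_inputs):
    if kernels_installed and os.environ.get("DISABLE_KERNEL_MAPPING") != "1":
        raise RuntimeError("Custom kernels are not yet compatible with torch.compile.")
    return gm.forward  # otherwise run the traced graph as-is

model = torch.compile(nn.Linear(4, 4), backend=strict_backend)
model(torch.randn(2, 4))  # raises at compile time unless DISABLE_KERNEL_MAPPING=1 is set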

@gante (Contributor, Author):

@qubvel I see your point: we leave the option to pass the flag; some layers won't support it by default, but at least we have some degree of control 👍

@gante (Contributor, Author):

@qubvel added ✅ I went with the config route, since most layers have access to the config and it doesn't require forwarding any argument, but let me know if you'd prefer having control through an argument.

(sorry for being obtuse in the comments above: I was too focused on having a solution that worked in all cases, and completely missed the point that partial support would still be useful 😅)
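
Roughly what the config-gated fallback looks like (a sketch with illustrative names; the merged diff is the source of truth). disable_custom_kernels is the config attribute that some models, e.g. RT-DETR, already expose.

import torch

def is_torchdynamo_compiling():
    # simplified stand-in for the transformers utility
    return torch.compiler.is_compiling()

def make_config_gated_forward(original_forward, kernel_forward):
    # the flag is read from the module's config, so nothing has to be
    # threaded through the forward arguments
    def forward(self, *args, **kwargs):
        config = getattr(self, "config", None)
        if is_torchdynamo_compiling() or getattr(config, "disable_custom_kernels", False):
            return original_forward(self, *args, **kwargs)
        return kernel_forward(self, *args, **kwargs)

    return forward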

@qubvel (Contributor) commented Apr 18, 2025:

Okay, thanks! I just realized neither of the options solves my issue entirely, but at least it will respect the configuration parameter, so the current implementation is going to work. However, for the refactored version with CoreML export support, I want to know in advance which path is going to be executed before passing inputs into the module. I have no idea how to get this; previously, it was resolved the following way:

# kernel path
if kernel_loaded and not is_compiling and not custom_kernels_disabled:
    hidden_state = kernel_forward(hidden_state)  # guaranteed to be kernel forward

# eager path, avoid 6D tensor
else:
    hidden_state = hidden_state.reshape(...)
    hidden_state = deform_attn_function(hidden_state)

I will solve it when I return to the RT-DETR refactoring, and maybe by then the kernels library will be updated; it's not an urgent issue.

@gante (Contributor, Author):

Good! We can revisit this later anyway, if we find a better way or run into new issues 🤗

@Cyrilvallez (Member):
Ha yes, my idea was to raise directly in the custom backend if kernels are installed and the env variable is active, and force the use of that backend everywhere (thus not changing the kernel decorator). Too bad if it does not work!

@gante (Contributor, Author) commented Apr 18, 2025:

@qubvel @Cyrilvallez let me know if you'd like further changes

(and sorry for being pushy -- CI is broken in several places until this is merged or we update a bunch of CI images)

@qubvel (Contributor) left a comment:

Looks good to me, thanks 👍

@Cyrilvallez (Member) left a comment:

LGTM thanks! I like this simple solution for now, IMO we should really aim at not touching the modeling files, otherwise it's gonna scale very badly in the future! 🤗 If truly needed for some models (rt-detr apparently) we can have a local exception, but it should really stay an exception!

@gante gante merged commit 1930e75 into huggingface:main Apr 21, 2025
20 checks passed
@gante gante deleted the kernels_compile branch April 21, 2025 12:23
zucchini-nlp pushed a commit to zucchini-nlp/transformers that referenced this pull request May 14, 2025