[TMTensor] Add lowering of attention with an explicit scale value by AGindinson · Pull Request #4593 · llvm/torch-mlir

AGindinson · 2026-06-03T20:41:27Z

This allows passing along the scale constant when it is set at the Torch level, and lifts the restriction on dynamic head dim size.

Essentially follows up on commit f53b5be.
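
For reference, here is a minimal sketch of what an explicit scale means for the attention scores. It assumes the usual scaled dot-product formulation, in which an unset scale falls back to 1/sqrt(headDim). The helper names `resolveScale` and `scaledScore` are hypothetical and are not part of torch-mlir or of this patch.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical helper: pick the factor applied to Q*K^T.
// An unset scale falls back to the conventional 1/sqrt(headDim).
double resolveScale(std::optional<double> scale, std::size_t headDim) {
  assert(headDim > 0 && "head dimension must be positive");
  if (scale.has_value())
    return *scale;
  return 1.0 / std::sqrt(static_cast<double>(headDim));
}

// Hypothetical helper: scaled dot product of one query row and one key row.
double scaledScore(const std::vector<double> &q, const std::vector<double> &k,
                   std::optional<double> scale = std::nullopt) {
  assert(q.size() == k.size() && "q and k must share the head dimension");
  double dot = 0.0;
  for (std::size_t i = 0; i < q.size(); ++i)
    dot += q[i] * k[i];
  return resolveScale(scale, q.size()) * dot;
}
```

Only the fallback branch needs to know the head dimension. When the scale is given explicitly, the head dimension does not enter the scale computation at all.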


ziereis · 2026-06-11T08:50:12Z

-      if (relativeError > 1e-6) {
-        return rewriter.notifyMatchFailure(
-            loc, "scale must be None or 1/sqrt(headDim)");
+        double expectedScale = 1.0 / std::sqrt(static_cast<double>(headDim));


Do we still need this? My understanding is that the scale was previously restricted to 1.0/sqrt(head_dim) because we could not pass the scale to the op as a parameter, so IREE would always implicitly assume 1.0/sqrt(head_dim). If we can now set it as an attribute, I assume we no longer need this limitation?
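
To make the check in question concrete, here is a standalone sketch of the comparison outside of MLIR. The hunk above only shows the `relativeError > 1e-6` test and the `expectedScale` expression, so the exact relative-error formula below is an assumption, and `isDefaultScale` is a hypothetical name.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical helper: returns true when `scale` matches 1/sqrt(headDim)
// within a relative tolerance. The relative-error formula is an assumption;
// the diff only shows the threshold comparison.
bool isDefaultScale(double scale, std::int64_t headDim, double tol = 1e-6) {
  assert(headDim > 0 && "the comparison needs a concrete head dimension");
  double expectedScale = 1.0 / std::sqrt(static_cast<double>(headDim));
  double relativeError = std::fabs(scale - expectedScale) / expectedScale;
  return relativeError <= tol;
}
```

For example, `isDefaultScale(0.125, 64)` holds because 1/sqrt(64) is 0.125, while `isDefaultScale(0.2, 64)` does not. A check of this shape can only be evaluated when the head dimension is known as a concrete number.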

ziereis · 2026-06-11T08:52:08Z

+    mlir::FloatAttr scaleAttr{};
    if (!isa<Torch::NoneType>(scale.getType())) {
      double scaleFloat;
      if (!matchPattern(scale, m_TorchConstantFloat(&scaleFloat)))


I guess it would be even nicer if we could make the scale a regular input to the op, so that it would not have to be a constant. However, I can see how this might be a bit complicated given the variadic argument handling the op currently has.

[TMTensor] Add lowering of attention with an explicit scale value

c968660


AGindinson mentioned this pull request Jun 3, 2026

[Torch] Support explicit scale values from tm_tensor.attention iree-org/iree#24566

Draft

AGindinson marked this pull request as ready for review June 3, 2026 20:53

AGindinson requested review from Groverkss, jtuyls and zjgarvey June 3, 2026 20:54

ziereis reviewed Jun 11, 2026

AGindinson wants to merge 1 commit into llvm:main from AGindinson:tmtensor-attention
