[WebNN] Support more features for GQA #27234

Open
Honry wants to merge 3 commits into microsoft:main from Honry:support-gqa-with-roe

Honry commented Feb 4, 2026

Add support for GroupQueryAttention with:

  • do_rotary=true (cos_cache/sin_cache inputs)
  • Packed QKV (optional key/value inputs)
  • Optional past_key/past_value for prefill mode
  • Remove fp16->fp32 casting workaround
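
For readers unfamiliar with the packed layout, here is a minimal sketch of how one packed row could be split. It assumes the packed hidden dimension is laid out as Q, then K, then V, with sizes `num_heads * head_size`, `kv_num_heads * head_size`, and `kv_num_heads * head_size`. `SplitPackedQkvRow` and `QkvSlices` are hypothetical names used only for illustration; they are not from this PR, which builds WebNN graph operations rather than copying host buffers.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical container for the three slices of one packed row.
struct QkvSlices {
  std::vector<float> q;  // num_heads * head_size values
  std::vector<float> k;  // kv_num_heads * head_size values
  std::vector<float> v;  // kv_num_heads * head_size values
};

// Hypothetical helper: splits one packed row, assumed to be laid out as [Q | K | V].
QkvSlices SplitPackedQkvRow(const std::vector<float>& packed_row,
                            std::size_t num_heads,
                            std::size_t kv_num_heads,
                            std::size_t head_size) {
  const std::size_t q_size = num_heads * head_size;
  const std::size_t kv_size = kv_num_heads * head_size;
  if (packed_row.size() != q_size + 2 * kv_size) {
    throw std::invalid_argument("packed row has unexpected hidden size");
  }
  const auto q_end = packed_row.begin() + static_cast<std::ptrdiff_t>(q_size);
  const auto k_end = q_end + static_cast<std::ptrdiff_t>(kv_size);
  QkvSlices out;
  out.q.assign(packed_row.begin(), q_end);
  out.k.assign(q_end, k_end);
  out.v.assign(k_end, packed_row.end());
  return out;
}
```

With `num_heads = 2`, `kv_num_heads = 1`, and `head_size = 2`, an 8-element row yields 4 values for Q and 2 each for K and V.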

Add ApplyRotaryEmbedding helper function.
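
As a rough illustration of the math such a helper performs, the sketch below rotates a single head vector on the host. It assumes the half-split (non-interleaved) pairing, where element `i` is paired with element `i + head_size / 2`, and it takes one row of each cache, matching the `[max_sequence_length, head_size / 2]` cache shapes quoted later in this thread. `RotateHalfSplit` is a hypothetical name; the PR's `ApplyRotaryEmbedding` operates on WebNN operands, and its exact signature is not reproduced here.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical host-side sketch: rotates one head vector in place.
// Assumes half-split pairing: head[i] is paired with head[i + head.size() / 2].
// cos_row and sin_row are the cache rows for this token's position,
// each holding head.size() / 2 values.
void RotateHalfSplit(std::vector<float>& head,
                     const std::vector<float>& cos_row,
                     const std::vector<float>& sin_row) {
  const std::size_t half = head.size() / 2;
  for (std::size_t i = 0; i < half; ++i) {
    const float x1 = head[i];
    const float x2 = head[i + half];
    // Standard 2D rotation of the pair (x1, x2) by the cached angle.
    head[i] = x1 * cos_row[i] - x2 * sin_row[i];
    head[i + half] = x2 * cos_row[i] + x1 * sin_row[i];
  }
}
```

A position whose cache row is `cos = 1, sin = 0` leaves the vector unchanged, which is a quick sanity check for any implementation.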

Fix the decode stage by using qkv_sequence_length instead of has_past_key to distinguish prefill from decode, and use the runtime seqlens_k instead of the static past_sequence_length to calculate rotary positions.
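
The position logic described above can be sketched as follows. This is an illustration under two stated assumptions, not the PR's code: decode is taken to mean `qkv_sequence_length == 1`, and `seqlens_k` is taken to hold the total sequence length minus one for a batch entry. `RotaryPositions` is a hypothetical name.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper: rotary position ids for one batch entry.
// Assumption: decode means qkv_sequence_length == 1.
// Assumption: seqlens_k == total_sequence_length - 1 for this entry.
std::vector<std::int64_t> RotaryPositions(std::int64_t qkv_sequence_length,
                                          std::int64_t seqlens_k) {
  std::vector<std::int64_t> positions;
  if (qkv_sequence_length == 1) {
    // Decode: the single new token sits at the last index of the sequence.
    positions.push_back(seqlens_k);
  } else {
    // Prefill: tokens occupy positions 0 .. qkv_sequence_length - 1.
    for (std::int64_t i = 0; i < qkv_sequence_length; ++i) {
      positions.push_back(i);
    }
  }
  return positions;
}
```

Under these assumptions a one-token prompt is handled consistently: `seqlens_k` is 0, so the decode branch still yields position 0.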

Honry commented Feb 4, 2026

@fdwr, @guschmue, PTAL, thanks!

fdwr previously approved these changes Feb 4, 2026

Minor comment, else LGTM.

guschmue added the ep:WebNN (WebNN execution provider) label Feb 5, 2026
guschmue previously approved these changes Feb 5, 2026

guschmue commented Feb 5, 2026

/azp run Linux QNN CI Pipeline,Win_TRT_Minimal_CUDA_Test_CI,Windows ARM64 QNN CI Pipeline,Windows GPU Doc Gen CI Pipeline

guschmue enabled auto-merge (squash) February 5, 2026 18:15
@azure-pipelines

Azure Pipelines successfully started running 4 pipeline(s).

guschmue commented Feb 5, 2026

run 'lintrunner -a' to make the CI happy

fdwr previously approved these changes Feb 6, 2026

👍

fdwr commented Feb 6, 2026

Hmm, linter issues. I can't tell what it's complaining about though:

-    emscripten::val input,// Shape: [batch_size, sequence_length, num_heads, head_size]
-    emscripten::val cos_cache,// Shape: [max_sequence_length, head_size / 2]
-    emscripten::val sin_cache,// Shape: [max_sequence_length, head_size / 2]
-    emscripten::val position_ids,// Shape: [batch_size, sequence_length] or [1]
+    emscripten::val input,// Shape: [batch_size, sequence_length, num_heads, head_size]
+    emscripten::val cos_cache,// Shape: [max_sequence_length, head_size / 2]
+    emscripten::val sin_cache,// Shape: [max_sequence_length, head_size / 2]
+    emscripten::val position_ids,// Shape: [batch_size, sequence_length] or [1]

auto-merge was automatically disabled February 6, 2026 01:29

Head branch was pushed to by a user without write access

Honry dismissed stale reviews from fdwr and guschmue via 55562e5 February 6, 2026 01:29

Honry commented Feb 6, 2026

Thanks very much, @fdwr and @guschmue. The lint error is fixed; please help retrigger the CI. Thanks!
