
@loci-dev

Mirrored from ggml-org/llama.cpp#18401

For the motivation of this proposal, please refer to this discussion: ggml-org/ggml#1401

For demo purposes, only 2 backends are implemented in this PR:

  • CPU
  • Metal

Main differences

| `ggml_rope_ext` | `ggml_rope_comp` |
| --- | --- |
| One single API call | Multiple composable calls |
| High-level input params like `freq_base`, YaRN scaling | Lower-level input params; most of the static params can be customized by user code |
| Multiple kernels, one per mode | One single kernel, templated by mode (mrope/vision) or controlled via input arg (neox/normal ordering) |
| I32 type for position | F32 type for position; shape is 1D for text and 4D for m-rope |
| m-rope only supports neox ordering | m-rope supports both neox and normal ordering |
| Does not support offset `n_rot` | Allows offset `n_rot` [1] |

[1] This is necessary because we may want to implement 2D-rope (vision mode) as two separate calls to ggml_rope_comp: one call with offset = 0 and the other with offset = n_rot/2. This is particularly useful for vision models like Pixtral, where the two parts of the 2D-rope do not use the same freq configuration.

Performance

I still couldn't get a meaningful perf result because --output csv does not work with test-backend-ops. But at a glance via --output console, it provides the same performance as the existing ggml_rope_ext.

