Add a build target to generate ROCm artifacts using ROCm 7.2 #19433

Merged
CISC merged 1 commit into ggml-org:master from superm1:superm1/rocm-github-action
Feb 21, 2026

superm1 commented Feb 8, 2026
Contributor

This builds the following targets:

  • gfx1151
  • gfx1150
  • gfx1200
  • gfx1201
  • gfx1100
  • gfx1101
  • gfx908
  • gfx90a
  • gfx942

@superm1 superm1 requested a review from CISC as a code owner February 8, 2026 15:32
@github-actions github-actions Bot added the devops improvements to build systems and github actions label Feb 8, 2026
@superm1 superm1 force-pushed the superm1/rocm-github-action branch from 1790896 to 8aeb553 Compare February 8, 2026 15:35
@CISC CISC requested a review from IMbackK February 8, 2026 18:09
IMbackK commented Feb 8, 2026
Collaborator

@superm1 I would lean towards simply generating the release with 7.1 and including all targets. While ROCm unfortunately does not have a stable ABI, in practice the 7.1 compile time + 7.2 runtime combination works fine.

Unless you know of an issue I am not aware of with some non-CDNA target in 7.1.

superm1 commented Feb 8, 2026
Contributor Author

Yes, there is an incompatibility with the mainline kernel on 7.1 for Strix Halo.

It's a long story, but it would be much better to release 7.2-based artifacts.

In this case the runtime and compile-time versions are identical because the ROCm libraries are bundled into the artifact rather than coming from the OS.

@IIIIIllllIIIIIlllll

ROCm/rocm-systems#2865 (comment)

Just a heads-up, ROCm 7.2 currently has some performance issues, and I'm not sure if they've been fixed.

superm1 commented Feb 9, 2026
Contributor Author

> Just a heads-up, ROCm 7.2 currently has some performance issues, and I'm not sure if they've been fixed.

My feeling is that this is a "perfect is the enemy of good" situation. I say that because there are no llama.cpp artifacts right now and everyone is compiling their own thing. This gets the ball rolling for official ones and we can all keep improving them.

Yes, there is a regression reported here. It has been root-caused to a compiler change, which has been reverted in the develop branch, but it will take a while to make its way to a stable release.

There is a workaround that can be applied right now to avoid it:

-mllvm --amdgpu-unroll-threshold-local=600

IMbackK commented Feb 9, 2026
Collaborator

Since the compiler is the problem in both cases, we could also downgrade just the compiler in the container by fetching the ROCm 7.1 package and installing it.

superm1 commented Feb 9, 2026
Contributor Author

Is that what you would rather see (set up 7.2 container, add 7.1 repos, and downgrade compiler to the one from 7.1)?

I did come up with a workaround for the fp16 issue if you would rather go that way: #19461

IMbackK commented Feb 9, 2026
Collaborator

No, #19461 is preferable.

Comment thread .github/workflows/release.yml Outdated
@superm1 superm1 force-pushed the superm1/rocm-github-action branch 7 times, most recently from 050b836 to 2b1e35b Compare February 11, 2026 13:26
slojosic-amd commented Feb 11, 2026
Contributor

@superm1 @IMbackK we should add -DCMAKE_HIP_FLAGS="-mllvm --amdgpu-unroll-threshold-local=600" here: https://github.com/ggml-org/llama.cpp/pull/19433/changes#diff-87db21a973eed4fef5f32b267aa60fcee5cbdf03c67fafdc2a9b553bb0b15f34R601 if we are planning to add artifacts based on the legacy ROCm 7.2 release.
This additional CMake flag fixes the ROCm 7.2 perf regression described here: ROCm/rocm-systems#2865
With this workaround we don't need to downgrade the compiler to the one from ROCm 7.1.
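As a rough illustration of where the flag would sit in a configure step, here is a minimal sketch. Only the `CMAKE_HIP_FLAGS` value comes from this thread; the `build` directory, `-DGGML_HIP=ON`, and the rest of the command line are assumptions, and the real `release.yml` invocation may differ. The script only prints the command, so it can be read without a ROCm install.

```shell
#!/bin/sh
set -eu

# Workaround flag quoted in this thread for the ROCm 7.2 perf regression.
hip_flags='-mllvm --amdgpu-unroll-threshold-local=600'

# Assumed configure command; everything except CMAKE_HIP_FLAGS is illustrative.
configure_cmd="cmake -S . -B build -DGGML_HIP=ON -DCMAKE_HIP_FLAGS=\"${hip_flags}\""

# Print the command rather than running it.
echo "${configure_cmd}"
```

The inner quotes matter: the flag value contains a space, so it has to reach CMake as a single argument.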

@superm1 superm1 force-pushed the superm1/rocm-github-action branch from 2b1e35b to 170f2f9 Compare February 11, 2026 20:41
superm1 commented Feb 11, 2026
Contributor Author

That's a great suggestion. I've modified the PR accordingly.

superm1 commented Feb 13, 2026
Contributor Author

Considering the comments in #19594, I have adjusted this PR to do 7.2 in the same way that 7.11 is done. That is, have a single artifact.

Basically install the ROCm stack for doing the build, but don't bundle ROCm itself in the artifact. The user would be responsible for installing ROCm to use the artifact.
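Since the artifact no longer carries ROCm, a user-side check before launching the binaries could look something like the sketch below. It is not part of this PR: the `/opt/rocm` default, the `ROCM_PATH` override, the `libamdhip64.so` file name check, and the `has_hip_runtime` helper are all assumptions made for illustration.

```shell
#!/bin/sh
set -eu

# Hypothetical helper: succeed if the prefix contains a HIP runtime library.
has_hip_runtime() {
    for f in "$1"/lib/libamdhip64.so*; do
        # An unmatched glob stays literal, so test that the path really exists.
        [ -e "$f" ] && return 0
    done
    return 1
}

# Assumed install location, overridable through ROCM_PATH.
rocm_prefix="${ROCM_PATH:-/opt/rocm}"

if has_hip_runtime "${rocm_prefix}"; then
    echo "HIP runtime found under ${rocm_prefix}"
else
    echo "no HIP runtime under ${rocm_prefix}; install ROCm before running the binaries" >&2
fi
```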

Here is a CI build from my fork demonstrating how it works now. The artifact is 464MB.

@superm1 superm1 force-pushed the superm1/rocm-github-action branch from 67a425d to 6559a81 Compare February 13, 2026 20:36
IMbackK commented Feb 13, 2026
Collaborator

So, as mentioned in #19594, I think this one is the better option.
I will take a look at the generated artifacts soon. After that I think we are good to proceed with this one.

@superm1 superm1 force-pushed the superm1/rocm-github-action branch from 6559a81 to d961293 Compare February 14, 2026 01:51
superm1 commented Feb 18, 2026
Contributor Author

Hi @IMbackK, can you take a look this week? As I mentioned in #19594, I do think that doing artifacts for both legacy and TheRock builds makes sense. If you agree, I can close the 7.11 one and merge it into this one. Or if you would prefer to only do 7.2, this PR should be sufficient on its own.

IMbackK commented Feb 18, 2026
Collaborator

Let's go for just the official release. For one thing, the builds are pretty large; for another, new versions of ROCm have broken things fairly often, and dealing with that at the higher release cadence of TheRock does not feel terribly appealing. Sure, we could just not update which TheRock build we build against, but that would IMO defeat the purpose of building against the preview versions at all.

Tiny nit: it would be better if the release name included the version of Ubuntu built against.

Comment thread .github/workflows/release.yml Outdated
CISC commented Feb 18, 2026
Member

> Tiny nit: it would be better if the release name included the version of Ubuntu built against.

None of the other releases have it, so it's fine for now.

IMbackK commented Feb 18, 2026
Collaborator

> > Tiny nit: it would be better if the release name included the version of Ubuntu built against.

> None of the other releases have it, so it's fine for now.

Yes, and I think it would be relevant for all releases since it tells you what version of glibc etc. it's built against.

CISC commented Feb 18, 2026
Member

> > > Tiny nit: it would be better if the release name included the version of Ubuntu built against.

> > None of the other releases have it, so it's fine for now.

> Yes, and I think it would be relevant for all releases since it tells you what version of glibc etc. it's built against.

Separate PR if anything, but not too keen on potentially breaking someone's workflow again.

@superm1 superm1 force-pushed the superm1/rocm-github-action branch from fb4909c to 4a1f236 Compare February 18, 2026 22:27

@CISC CISC left a comment

Member

Merge on @IMbackK approval.

superm1 commented Feb 18, 2026
Contributor Author

> Merge on @IMbackK approval.

Can you trigger the rest of the CI jobs?

CISC commented Feb 19, 2026
Member

> > Merge on @IMbackK approval.

> Can you trigger the rest of the CI jobs?

There are no more jobs related to release.yml; the actual release job must be tested on the fork, which I see you sort of did. Did you test the binaries?

superm1 commented Feb 19, 2026
Contributor Author

Yes, I just tested them again this morning. This is an Ubuntu 24.04 toolbox that I installed ROCm 7.2 into before running the binaries.

[supermario@toolbx llama-b8028]$ ./llama-cli -m /home/supermario/.cache/huggingface/hub/models--LiquidAI--LFM2-1.2B-GGUF/snapshots/ad410707125b58bc535ac81e21eeb84d56a5e2ee/LFM2-1.2B-Q4_K_M.gguf --ctx-size 4096  --jinja --context-shift --keep 16 --reasoning-format auto -ngl 99
ggml_cuda_init: found 1 ROCm devices:
  Device 0: Radeon 8050S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32
load_backend: loaded ROCm backend from /home/supermario/Downloads/llama-bin-rocm/llama-b8028/libggml-hip.so
load_backend: loaded RPC backend from /home/supermario/Downloads/llama-bin-rocm/llama-b8028/libggml-rpc.so
load_backend: loaded CPU backend from /home/supermario/Downloads/llama-bin-rocm/llama-b8028/libggml-cpu-zen4.so

Loading model...  


▄▄ ▄▄
██ ██
██ ██  ▀▀█▄ ███▄███▄  ▀▀█▄    ▄████ ████▄ ████▄
██ ██ ▄█▀██ ██ ██ ██ ▄█▀██    ██    ██ ██ ██ ██
██ ██ ▀█▄██ ██ ██ ██ ▀█▄██ ██ ▀████ ████▀ ████▀
                                    ██    ██
                                    ▀▀    ▀▀

build      : b8028-92e9e70a
model      : LFM2-1.2B-Q4_K_M.gguf
modalities : text

available commands:
  /exit or Ctrl+C     stop or exit
  /regen              regenerate the last response
  /clear              clear the chat history
  /read               add a text file


> foo the bar

This phrase is often used as a placeholder or a part of a larger phrase or sentence. It's the beginning of "foo the bar," which could mean:

1. **A function or command name**: It could be the name of a simple program or script that performs some action (like `foo`) and then does something else (like `the bar`).
2. **A title or heading**: It might be the title of an article, a section in a document, or a part of a title.
3. **A fragment of code or instruction**: It could be a starting point for a code snippet or a programming instruction.
4. **A creative or artistic statement**: It could be used in a piece of writing or art to introduce a theme or idea.

Without more context, it's hard to determine the exact meaning, but it generally starts with "foo" to indicate a simple or generic term, and "the bar" to add more specificity or context. If you have a specific context or use case in mind, feel free to share more, and I can provide a more detailed response!

[ Prompt: 324.8 t/s | Generation: 186.9 t/s ]

> 

Exiting...
llama_memory_breakdown_print: | memory breakdown [MiB]     | total    free    self   model   context   compute       unaccounted |
llama_memory_breakdown_print: |   - ROCm0 (8050S Graphics) | 12125 = 12632 + ( 878 =   694 +      48 +     136) + 17592186043030 |
llama_memory_breakdown_print: |   - Host                   |                   121 =   105 +       0 +      16                   |

IMbackK commented Feb 19, 2026
Collaborator

I just noticed that gfx1030 went missing here too. We probably want to build this for gfx1030 as well, since that is an architecture AMD both builds for and officially supports. I need to check the release images to see if AMD is currently also building for gfx1010, gfx1031, and gfx1032, which are not officially supported architectures but were built for in ROCm 7.0.

IMbackK commented Feb 19, 2026
Collaborator

I tried the release binaries locally on gfx908 and they worked fine here, no issues on that front.

This builds the following targets:
 * gfx1151
 * gfx1150
 * gfx1200
 * gfx1201
 * gfx1100
 * gfx1101
 * gfx1030
 * gfx908
 * gfx90a
 * gfx942
@superm1 superm1 force-pushed the superm1/rocm-github-action branch from 4a1f236 to 3493f6d Compare February 19, 2026 22:04

@IMbackK IMbackK left a comment

Collaborator

I have no further objections.

Side note:
Since this is bring-your-own-ROCm, in the future we could also consider building for architectures that are not in the default ROCm release but are known to work well, like gfx900, gfx906, gfx101x, and gfx1031. For this initial PR, just the officially supported architectures is good.

@CISC CISC merged commit f75c4e8 into ggml-org:master Feb 21, 2026
2 checks passed
liparetejas pushed a commit to liparetejas/llama.cpp that referenced this pull request Feb 23, 2026
…g#19433)

This builds the following targets:
 * gfx1151
 * gfx1150
 * gfx1200
 * gfx1201
 * gfx1100
 * gfx1101
 * gfx1030
 * gfx908
 * gfx90a
 * gfx942