
Adjust workaround for ROCWMMA_FATTN/GFX9 to only newer ROCm versions (#19591)

Merged

JohannesGaessler merged 1 commit into ggml-org:master from superm1:superm1/fix-19580 on Feb 16, 2026

Conversation

@superm1 (Contributor) commented Feb 13, 2026

Avoids issues with ROCm 6.4.4.

Closes: #19580
Fixes: 6845f7f ("Add a workaround for compilation with ROCWMMA_FATTN and gfx9 (#19461)")

Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
@github-actions github-actions bot added the "Nvidia GPU" (Issues specific to Nvidia GPUs) and "ggml" (changes relating to the ggml tensor library for machine learning) labels on Feb 13, 2026
@IMbackK (Collaborator) left a comment


Ideally someone would also test this on ROCm 7.0, since we are not exactly sure when this change was introduced. But let's merge this with 6.4 as the cutoff to get things going.

cc: @JohannesGaessler

@JohannesGaessler JohannesGaessler merged commit 2ba9adc into ggml-org:master Feb 16, 2026
74 of 78 checks passed
michaelneale added a commit to michaelneale/llama.cpp that referenced this pull request Feb 17, 2026
* upstream/master: (88 commits)
  ci : bump komac version (ggml-org#19682)
  build : link ws2_32 as PUBLIC on Windows (ggml-org#19666)
  build : cleanup library linking logic (ggml-org#19665)
  convert : add JoyAI-LLM-Flash (ggml-org#19651)
  perplexity: add proper batching (ggml-org#19661)
  common : inline functions (ggml-org#18639)
  ggml : make `ggml_is_view` as API (ggml-org#19539)
  model: Add support for Tiny Aya Models (ggml-org#19611)
  build : rework llama_option_depr to handle LLAMA_CURL (ggml-org#19658)
  Adjust workaround for ROCWMMA_FATTN/GFX9 to only newer ROCm versions (ggml-org#19591)
  models : deduplicate delta-net graphs for Qwen family (ggml-org#19597)
  graph : fix KQ mask, lora, cvec reuse checks (ggml-org#19644)
  ggml: aarch64: Implement SVE in Gemm q4_k 8x8 q8_k Kernel  (ggml-org#19132)
  sync : ggml
  ggml : bump version to 0.9.7 (ggml/1425)
  ggml : bump version to 0.9.6 (ggml/1423)
  cuda: optimize iq2xxs/iq2xs/iq3xxs dequantization (ggml-org#19624)
  docs: update s390x build docs (ggml-org#19643)
  build : remove LLAMA_HTTPLIB option (ggml-org#19623)
  cmake : check if KleidiAI API has been fetched (ggml-org#19640)
  ...
liparetejas pushed a commit to liparetejas/llama.cpp that referenced this pull request Feb 23, 2026
Adjust workaround for ROCWMMA_FATTN/GFX9 to only newer ROCm versions (ggml-org#19591)

Avoids issues with ROCm 6.4.4.

Closes: ggml-org#19580
Fixes: 6845f7f ("Add a workaround for compilation with ROCWMMA_FATTN and gfx9 (ggml-org#19461)")

Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>

Labels

ggml — changes relating to the ggml tensor library for machine learning
Nvidia GPU — Issues specific to Nvidia GPUs

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Compile bug: ROCm - error: no matching function for call to 'fill_fragment'

3 participants