Upgrade llama.cpp from b9151 to b9172 by bernardladenthin · Pull Request #143 · bernardladenthin/java-llama.cpp

bernardladenthin · 2026-05-15T23:05:24Z

Summary

Upgrades the pinned llama.cpp version from b9151 to b9172 and adds the build configuration needed to handle upstream changes to WebUI asset provisioning.

Changes Made

CMakeLists.txt: Updated GIT_TAG from b9151 to b9172 and added set(LLAMA_BUILD_WEBUI OFF CACHE BOOL "" FORCE) to disable WebUI asset building (upstream now provisions assets at build time via HF bucket by default)
README.md: Updated llama.cpp version badge and link from b9151 to b9172
CLAUDE.md:
- Updated pinned version reference from b9151 to b9172
- Added changelog entries documenting upstream changes between b9151–b9172:
  - clip_has_whisper_encoder() removed from public API (no project impact)
  - WebUI assets now provisioned at build time; project disables asset download via LLAMA_BUILD_WEBUI=OFF
  - common_params::webui default made conditional on LLAMA_WEBUI_DEFAULT_ENABLED macro
  - Bug fix in common_reasoning_budget_clone (no API change)
  - AMD RDNA3 WMMA flash attention support in CUDA backend
  - Non-function Responses API tools now silently skipped in server-chat

Implementation Details

The LLAMA_BUILD_WEBUI=OFF cache variable is set before FetchContent_MakeAvailable() to prevent upstream from downloading WebUI assets during the build. This aligns with the project's build strategy and avoids unnecessary network I/O.
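The ordering described above can be sketched as the following CMakeLists.txt fragment. It is a minimal illustration, not the project's actual file: the dependency name `llama.cpp` and the repository URL are assumptions. Only the `set(...)` line and the `b9172` tag are taken from this PR.

```cmake
include(FetchContent)

# Set the cache entry first. When upstream later declares the same name
# with option(), CMake keeps the value that is already in the cache.
set(LLAMA_BUILD_WEBUI OFF CACHE BOOL "" FORCE)

FetchContent_Declare(
    llama.cpp                                                  # assumed dependency name
    GIT_REPOSITORY https://github.com/ggml-org/llama.cpp.git   # assumed URL
    GIT_TAG        b9172
)

# Upstream's CMakeLists.txt is processed here, so every override
# intended for it has to appear above this call.
FetchContent_MakeAvailable(llama.cpp)
```

Placing the `set(...)` call after `FetchContent_MakeAvailable()` would be too late: upstream's configuration would already have run with its own default.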

All upstream changes are either internal (CUDA backend, server implementation) or non-breaking for the Java bindings. No JNI code modifications were required.

https://claude.ai/code/session_01DVizDEtXBVDaXciEoo9a8v

Also adds LLAMA_BUILD_WEBUI=OFF before FetchContent to prevent the new build-time WebUI asset download introduced in b9172 from running during CI/local builds. No JNI-level API changes were required.

claude · 2026-05-15T23:06:58Z

Review Summary

✅ Strengths

- Consistent version updates: All references to the llama.cpp version are updated from b9151 to b9172 across CMakeLists.txt, README.md, and CLAUDE.md.
- Proper CMake configuration order: The LLAMA_BUILD_WEBUI=OFF cache variable is placed before FetchContent_MakeAvailable(), so upstream respects the setting.
- Well-documented changelog: The CLAUDE.md additions clearly document the upstream changes and their project impact.
- Minimal, focused PR: The changes are scoped appropriately for a version upgrade, with no unnecessary modifications.

⚠️ Question for Clarification

WebUI asset provisioning: The CLAUDE.md documentation indicates two upstream mechanisms for WebUI provisioning:

- LLAMA_USE_PREBUILT_WEBUI=ON (default): downloads prebuilt assets from the HF bucket
- LLAMA_BUILD_WEBUI: builds assets from source

The PR sets LLAMA_BUILD_WEBUI=OFF (disabling source build), but since LLAMA_USE_PREBUILT_WEBUI=ON is the default, will the build still attempt to download prebuilt WebUI assets from the HF bucket?

Clarification needed: Should the configuration also include set(LLAMA_USE_PREBUILT_WEBUI OFF CACHE BOOL "" FORCE) to fully prevent WebUI asset downloads and achieve the stated goal of avoiding unnecessary network I/O?

If the current setting is confirmed to work as intended (i.e., setting only LLAMA_BUILD_WEBUI=OFF prevents all WebUI provisioning), please verify this is tested in the CI pipeline.
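One way to make this visible in CI logs is to print the effective option values once upstream has been configured. The fragment below is a hedged sketch: it assumes it is placed after FetchContent_MakeAvailable() in the project's CMakeLists.txt, and the option name LLAMA_USE_PREBUILT_WEBUI is taken from CLAUDE.md as cited in this review, not verified against upstream.

```cmake
# Report the effective value of each WebUI-related option at configure time.
foreach(opt LLAMA_BUILD_WEBUI LLAMA_USE_PREBUILT_WEBUI)
    if(DEFINED ${opt})
        message(STATUS "${opt} = ${${opt}}")
    else()
        message(STATUS "${opt} is not defined")
    endif()
endforeach()

# Optional guard: warn if the prebuilt download path still appears enabled.
if(LLAMA_USE_PREBUILT_WEBUI)
    message(WARNING "LLAMA_USE_PREBUILT_WEBUI is ON; WebUI assets may still be downloaded")
endif()
```

The status lines only show what CMake resolved. Whether a download actually happens depends on how upstream combines the two options, which still needs to be checked against the b9172 sources.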

Summary

The PR is technically sound and well-documented. The clarification above is the only item that should be addressed before merging.

claude · 2026-05-15T23:07:04Z

 set(GGML_FMA     ON  CACHE BOOL "" FORCE)
 set(GGML_F16C    ON  CACHE BOOL "" FORCE)
 set(GGML_AVX512  OFF CACHE BOOL "" FORCE)
+set(LLAMA_BUILD_WEBUI OFF CACHE BOOL "" FORCE)


Consider whether LLAMA_USE_PREBUILT_WEBUI also needs to be disabled. According to CLAUDE.md, upstream defaults to LLAMA_USE_PREBUILT_WEBUI=ON which downloads prebuilt WebUI assets. Setting LLAMA_BUILD_WEBUI=OFF prevents building from source, but the prebuilt download might still occur. If the goal is to fully avoid network I/O, you may also need:

set(LLAMA_USE_PREBUILT_WEBUI OFF CACHE BOOL "" FORCE)

Or confirm that LLAMA_BUILD_WEBUI=OFF alone is sufficient to prevent all WebUI asset provisioning.

bernardladenthin merged commit 98174fe into main May 15, 2026
4 of 16 checks passed

bernardladenthin deleted the claude/update-b9172-compatibility-I3YnI branch May 15, 2026 23:06

bernardladenthin merged 1 commit into main from claude/update-b9172-compatibility-I3YnI
