Compile bug: BUILD_SHARED_LIBS=ON + GGML_CUDA=ON fails at llama-server link with undefined references to cuMem*/cuDevice* (missing INTERFACE propagation of CUDA::cuda_driver) #23357

@aamsellem

Description

Building ggml-org/llama.cpp master / tag b9219 with BUILD_SHARED_LIBS=ON + GGML_CUDA=ON fails at the final llama-server / llama-cli link with undefined references to CUDA Driver API symbols (the cuMem* virtual memory management calls plus cuDeviceGet, cuDeviceGetAttribute and cuGetErrorString). The shared libggml-cuda.so itself links fine, because ld tolerates undefined symbols by default when it creates a shared library, but anything downstream that links against it cannot resolve the cu* symbols transitively.

[100%] Linking CXX executable ../../bin/llama-server
/usr/bin/ld: ../../bin/libggml-cuda.so.0.12.0: undefined reference to `cuMemCreate'
/usr/bin/ld: ../../bin/libggml-cuda.so.0.12.0: undefined reference to `cuMemAddressReserve'
/usr/bin/ld: ../../bin/libggml-cuda.so.0.12.0: undefined reference to `cuMemUnmap'
/usr/bin/ld: ../../bin/libggml-cuda.so.0.12.0: undefined reference to `cuMemSetAccess'
/usr/bin/ld: ../../bin/libggml-cuda.so.0.12.0: undefined reference to `cuDeviceGet'
/usr/bin/ld: ../../bin/libggml-cuda.so.0.12.0: undefined reference to `cuMemAddressFree'
/usr/bin/ld: ../../bin/libggml-cuda.so.0.12.0: undefined reference to `cuGetErrorString'
/usr/bin/ld: ../../bin/libggml-cuda.so.0.12.0: undefined reference to `cuDeviceGetAttribute'
/usr/bin/ld: ../../bin/libggml-cuda.so.0.12.0: undefined reference to `cuMemMap'
/usr/bin/ld: ../../bin/libggml-cuda.so.0.12.0: undefined reference to `cuMemRelease'
/usr/bin/ld: ../../bin/libggml-cuda.so.0.12.0: undefined reference to `cuMemGetAllocationGranularity'
collect2: error: ld returned 1 exit status

Root cause

ggml/src/ggml-cuda/CMakeLists.txt line 181 already does:

target_link_libraries(ggml-cuda PRIVATE CUDA::cuda_driver)

…but the dependency is PRIVATE. That is enough to link libggml-cuda.so itself (ld tolerates undefined symbols when creating a shared library), but it does not propagate to the downstream targets (ggml, llama, llama-server, llama-cli) that link against the shared ggml-cuda.

Result: libggml-cuda.so ships with unresolved cu* driver symbols, then ld refuses to link llama-server against it because the closure isn't satisfied.
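
One way to check this on a built tree is to list the undefined CUDA Driver API symbols in the library and to see whether libcuda appears among its DT_NEEDED entries. The following is a minimal sketch, assuming binutils (nm, readelf), awk and grep are installed. The function names are hypothetical helpers, not part of llama.cpp, and the library path in the usage comments is inferred from the link log above.

```shell
#!/bin/sh
# Hypothetical helpers for inspecting a built libggml-cuda.so.

# Read `nm -D --undefined-only` output on stdin and print only the
# CUDA Driver API symbols (cu + uppercase letter, e.g. cuMemCreate).
# Runtime API symbols such as cudaMalloc do not match.
filter_driver_syms() {
    awk '$1 == "U" && $2 ~ /^cu[A-Z]/ { sub(/@.*/, "", $2); print $2 }'
}

# Read `readelf -d` output on stdin; succeed if libcuda is a DT_NEEDED entry.
# libcudart.so does not match, because ".so" must follow "libcuda" directly.
needs_libcuda() {
    grep 'NEEDED' | grep -q 'libcuda\.so'
}

# Usage (path assumed from the link log):
#   nm -D --undefined-only build/bin/libggml-cuda.so | filter_driver_syms
#   readelf -d build/bin/libggml-cuda.so | needs_libcuda && echo "libcuda is NEEDED"
```

The first command should list the same cu* names that ld complains about. The second shows whether the library itself records a dependency on the driver library, which helps when deciding whether the missing piece is on the library or on the executable link line.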

Reproducer

FROM nvidia/cuda:13.0.0-devel-ubuntu24.04 AS builder
RUN apt-get update && apt-get install -y --no-install-recommends \
        build-essential cmake git curl ca-certificates \
        libcurl4-openssl-dev libgomp1 python3 python3-pip \
    && rm -rf /var/lib/apt/lists/*
WORKDIR /build
RUN git clone --depth 1 --branch b9219 https://github.com/ggml-org/llama.cpp.git .
RUN cmake -B build \
    -DCMAKE_BUILD_TYPE=Release \
    -DBUILD_SHARED_LIBS=ON \
    -DGGML_CUDA=ON \
    -DGGML_CUDA_FA=ON \
    -DGGML_CUDA_FA_ALL_QUANTS=ON \
    -DLLAMA_CURL=ON \
    -DCMAKE_CUDA_ARCHITECTURES="120"
RUN cmake --build build --config Release -j$(nproc) --target llama-server llama-cli

Built with docker buildx build --platform linux/amd64 . on macOS under QEMU emulation. The failure is identical on tag b9219 (2026-05-18) and b9180 (2026-05-16 MTP merge), so it is not specific to the latest tag. It looks like a long-standing link-scope bug that is only triggered on the BUILD_SHARED_LIBS=ON build path without GGML_BACKEND_DL.

Working fix

Anbeeld hit the same bug on a downstream fork and landed the proper CMake change as commit 2b9aa77aa6, described in Anbeeld/beellama.cpp#18. The minimal patch — in ggml/src/ggml-cuda/CMakeLists.txt near the existing CUDA::cuda_driver link line:

target_link_libraries(ggml-cuda PRIVATE CUDA::cuda_driver)
if (NOT GGML_BACKEND_DL)
    target_link_libraries(ggml-cuda INTERFACE $<LINK_ONLY:CUDA::cuda_driver>)
endif()

This:

  • Leaves the dynamic-backend (GGML_BACKEND_DL=ON) path unchanged
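  • Adds an INTERFACE propagation wrapped in $<LINK_ONLY:...> for the BUILD_SHARED_LIBS=ON path, so executables linking against ggml-cuda automatically get libcuda.so on their link line without inheriting the driver target's other usage requirements (include directories, compile definitions)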
  • Mirrors what the installed package config does in ggml-cuda-config.cmake for VMM

I verified this on the same Ubuntu 24.04 / CUDA 13.0 / CMAKE_CUDA_ARCHITECTURES=120 (Blackwell consumer sm_120, RTX 5090M) reproducer: with Anbeeld's fix applied, llama-server and llama-cli link cleanly without any linker workaround.
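
A quick way to check that the propagation took effect is to look at the link command CMake generated for the executable. With the Makefile generators, CMake writes it to CMakeFiles/<target>.dir/link.txt inside the target's build directory. The sketch below assumes that generator. The helper name is hypothetical, and the llama-server path in the usage comment is an assumption about the build tree layout.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a generated link command mentions the
# CUDA driver library, either as a path to libcuda.so or as -lcuda.
# libcudart.so and -lcudart are deliberately not matched.
link_line_has_libcuda() {
    grep -Eq -e 'libcuda\.so|(^|[[:space:]])-lcuda([[:space:]]|$)' "$1"
}

# Usage (path is an assumption; adjust to your build tree):
#   link_line_has_libcuda build/tools/server/CMakeFiles/llama-server.dir/link.txt \
#       && echo "driver library is on the link line"
```

Running the build with cmake --build build --verbose prints the same link command to the terminal, if reading it directly is more convenient.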

Workaround (no source change)

For users hitting this today, the build can be unblocked by pushing the driver lib onto the executable/shared link lines via CMake:

cmake -B build \
    -DBUILD_SHARED_LIBS=ON -DGGML_CUDA=ON \
    -DCMAKE_EXE_LINKER_FLAGS="-L/usr/local/cuda/lib64/stubs -lcuda" \
    -DCMAKE_SHARED_LINKER_FLAGS="-L/usr/local/cuda/lib64/stubs -lcuda" \
    ...
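
The stubs path above is hard-coded for the default toolkit location. A small sketch of how the flags could be derived from a toolkit root instead, failing early when no stub is present, follows. The function name and the optional root argument are hypothetical; only the lib64/stubs/libcuda.so layout is taken from the command above.

```shell
#!/bin/sh
# Hypothetical helper: print linker flags for the CUDA driver stub under a
# toolkit root (default /usr/local/cuda). Returns non-zero if no stub exists.
cuda_stub_flags() {
    root=${1:-/usr/local/cuda}
    stub_dir=$root/lib64/stubs
    if [ -f "$stub_dir/libcuda.so" ]; then
        printf '%s\n' "-L$stub_dir -lcuda"
    else
        echo "no libcuda.so stub under $stub_dir" >&2
        return 1
    fi
}

# Usage:
#   flags=$(cuda_stub_flags) && cmake -B build \
#       -DBUILD_SHARED_LIBS=ON -DGGML_CUDA=ON \
#       -DCMAKE_EXE_LINKER_FLAGS="$flags" \
#       -DCMAKE_SHARED_LINKER_FLAGS="$flags"
```

The stub only satisfies the linker at build time. At run time the real libcuda.so.1 still has to come from the NVIDIA driver installation on the host.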

Context

Hit while building a custom image to test the new MTP support (#22673) on Olares One (RTX 5090M / sm_120 Blackwell consumer mobile). The official ghcr.io/ggml-org/llama.cpp:server-cuda13 rolling tag is still at b9191 (pre-MTP merge → post-MTP merge transition published 2026-05-17) and hasn't moved past it, so building locally is currently the only way to get b9180+ for Blackwell mobile.

Happy to send a PR with the target_link_libraries(... INTERFACE ...) line if useful.
